(57) Abstract

Compounds and methods for inducing protective immunity against tuberculosis are disclosed. The compounds provided include 
polypeptides that contain at least one immunogenic portion of one or more M. tuberculosis proteins and DNA molecules encoding
such polypeptides. Such compounds may be formulated into vaccines and/or pharmaceutical compositions for immunization against M. 
tuberculosis infection, or may be used for the diagnosis of tuberculosis. 
COMPOUNDS AND METHODS FOR IMMUNOTHERAPY
AND DIAGNOSIS OF TUBERCULOSIS 

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of U.S. Application
No. 09/025,197, filed February 18, 1998; which is a continuation-in-part of
U.S. Application No. 08/942,578, filed October 1, 1997; which is a continuation-in-part
of U.S. Application No. 08/818,112, filed March 13, 1997; which is a continuation-in-part
of U.S. Application No. 08/730,510, filed October 11, 1996; which claims priority
from PCT Application No. PCT/US96/14674, filed August 30, 1996; and is a
continuation-in-part of U.S. Application No. 08/680,574, filed July 12, 1996; which is a
continuation-in-part of U.S. Application No. 08/659,683, filed June 5, 1996; which is a
continuation-in-part of U.S. Application No. 08/620,874, filed March 22, 1996, now
abandoned; which is a continuation-in-part of U.S. Application No. 08/533,634, filed
September 22, 1995, now abandoned; which is a continuation-in-part of
U.S. Application No. 08/523,436, filed September 1, 1995, now abandoned.
TECHNICAL FIELD

The present invention relates generally to detecting, treating and 
preventing Mycobacterium tuberculosis infection. The invention is more particularly 
related to polypeptides comprising a Mycobacterium tuberculosis antigen, or a portion 
or other variant thereof, and the use of such polypeptides for diagnosing and vaccinating 
against Mycobacterium tuberculosis infection. 



BACKGROUND OF THE INVENTION 

Tuberculosis is a chronic, infectious disease that is generally caused by
infection with Mycobacterium tuberculosis. It is a major disease in developing
countries, as well as an increasing problem in developed areas of the world, with about
8 million new cases and 3 million deaths each year. Although the infection may be
asymptomatic for a considerable period of time, the disease is most commonly
manifested as an acute inflammation of the lungs, resulting in fever and a nonproductive
cough. If left untreated, serious complications and death typically result.

Although tuberculosis can generally be controlled using extended
antibiotic therapy, such treatment is not sufficient to prevent the spread of the disease.
Infected individuals may be asymptomatic, but contagious, for some time. In addition,
although compliance with the treatment regimen is critical, patient behavior is difficult
to monitor. Some patients do not complete the course of treatment, which can lead to
ineffective treatment and the development of drug resistance.

Inhibiting the spread of tuberculosis requires effective vaccination and
accurate, early diagnosis of the disease. Currently, vaccination with live bacteria is the
most efficient method for inducing protective immunity. The most common
Mycobacterium employed for this purpose is Bacillus Calmette-Guerin (BCG), an
avirulent strain of Mycobacterium bovis. However, the safety and efficacy of BCG is a
source of controversy and some countries, such as the United States, do not vaccinate
the general public. Diagnosis is commonly achieved using a skin test, which involves
intradermal exposure to tuberculin PPD (protein-purified derivative). Antigen-specific
T cell responses result in measurable induration at the injection site by 48-72 hours after
injection, which indicates exposure to Mycobacterial antigens. Sensitivity and
specificity have, however, been a problem with this test, and individuals vaccinated
with BCG cannot be distinguished from infected individuals.

While macrophages have been shown to act as the principal effectors of
M. tuberculosis immunity, T cells are the predominant inducers of such immunity. The
essential role of T cells in protection against M. tuberculosis infection is illustrated by
the frequent occurrence of M. tuberculosis in AIDS patients, due to the depletion of
CD4 T cells associated with human immunodeficiency virus (HIV) infection.
Mycobacterium-reactive CD4 T cells have been shown to be potent producers of
gamma-interferon (IFN-γ), which, in turn, has been shown to trigger the
anti-mycobacterial effects of macrophages in mice. While the role of IFN-γ in humans is
less clear, studies have shown that 1,25-dihydroxy-vitamin D3, either alone or in
combination with IFN-γ or tumor necrosis factor-alpha, activates human macrophages
to inhibit M. tuberculosis infection. Furthermore, it is known that IFN-γ stimulates
human macrophages to make 1,25-dihydroxy-vitamin D3. Similarly, IL-12 has been
shown to play a role in stimulating resistance to M. tuberculosis infection. For a review
of the immunology of M. tuberculosis infection see Chan and Kaufmann in
Tuberculosis: Pathogenesis, Protection and Control, Bloom (ed.), ASM Press,
Washington, DC, 1994.

Accordingly, there is a need in the art for improved vaccines and 
methods for preventing, treating and detecting tuberculosis. The present invention 
fulfills these needs and further provides other related advantages. 

SUMMARY OF THE INVENTION 

Briefly stated, this invention provides compounds and methods for
preventing and diagnosing tuberculosis. In one aspect, polypeptides are provided
comprising an immunogenic portion of a soluble M. tuberculosis antigen, or a variant of
such an antigen that differs only in conservative substitutions and/or modifications. In
one embodiment of this aspect, the soluble antigen has one of the following N-terminal
sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-
Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 120)

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
Ser; (SEQ ID No. 121)

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 122)

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
Pro; (SEQ ID No. 123)

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val;
(SEQ ID No. 124)

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID
No. 125)

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-
Pro-Pro-Ser; (SEQ ID No. 126)
(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-[illegible]-
Gly; (SEQ ID No. 127)

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-
Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-
Ala-Asn; (SEQ ID No. 128)

(j) [illegible]-Ser; (SEQ ID No. 134)

(k) [illegible]-Asp; (SEQ ID No. 135) or

(l) Ala-Pro-Glu-Ser-Gly-Ala-[illegible]-
Gly; (SEQ ID No. 136)
wherein Xaa may be any amino acid.

In a related aspect, polypeptides are provided comprising an
immunogenic portion of an M. tuberculosis antigen, or a variant of such an antigen that
differs only in conservative substitutions and/or modifications, the antigen having one
of the following N-terminal sequences:

(m) Xaa-Tyr-Ile-Ala-Tyr-[illegible]-
Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137) or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-
Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 129)
wherein Xaa may be any amino acid.

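In another embodiment, the antigen comprises an amino acid sequence
encoded by a DNA sequence selected from the group consisting of the sequences
recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101, the
complements of said sequences, and DNA sequences that hybridize to a sequence
recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101 or a complement thereof
under moderately stringent conditions.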

In a related aspect, the polypeptides comprise an immunogenic portion of
a M. tuberculosis antigen, or a variant of such an antigen that differs only in
conservative substitutions and/or modifications, wherein the antigen comprises an


WO 99/42076 






PCT/US99/03268 






amino acid sequence encoded by a DNA sequence selected from the group consisting of 
the sequences recited in SEQ ID Nos.: 26-51, 138, 139, 163-183, 201, 240, 242-247,
253-256, 295-298, 309, 316, 318-320, 322, 324, 328, 329, 333, 335, 337, 339 and 341,
the complements of said sequences, and DNA sequences that hybridize to a sequence
recited in SEQ ID Nos.: 26-51, 138, 139, 163-183, 201, 240, 242-247, 253-256, 295-298,
309, 316, 318-320, 322, 324, 328, 329, 333, 335, 337, 339 and 341 or a
complement thereof under moderately stringent conditions.

In related aspects, DNA sequences encoding the above polypeptides, 
expression vectors comprising these DNA sequences and host cells transformed or 
transfected with such expression vectors are also provided. 

In another aspect, the present invention provides fusion proteins 
comprising a first and a second inventive polypeptide or, alternatively, an inventive 
polypeptide and a known M. tuberculosis antigen. 

Within other aspects, the present invention provides pharmaceutical 
compositions that comprise one or more of the above polypeptides, or a DNA molecule 
encoding such polypeptides, and a physiologically acceptable carrier. The invention 
also provides vaccines comprising one or more of the polypeptides as described above 
and a non-specific immune response enhancer, together with vaccines comprising one 
or more DNA sequences encoding such polypeptides and a non-specific immune 
response enhancer. 

In yet another aspect, methods are provided for inducing protective 
immunity in a patient, comprising administering to a patient an effective amount of one 
or more of the above polypeptides. 

In further aspects of this invention, methods and diagnostic kits are 
provided for detecting tuberculosis in a patient. The methods comprise contacting
dermal cells of a patient with one or more of the above polypeptides and detecting an 
immune response on the patient's skin. The diagnostic kits comprise one or more of the 
above polypeptides in combination with an apparatus sufficient to contact the 
polypeptide with the dermal cells of a patient. 






In yet other aspects, methods are provided for detecting tuberculosis in a
patient, such methods comprising contacting dermal cells of a patient with one or more
polypeptides encoded by a DNA sequence selected from the group consisting of SEQ
ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200, 203, 215-225, 237, 239,
261-276, 292, 293, 303-308, 310-315, 317, 321, 323, 325-327, 330-332, 334, 336, 338, 340
and 342-347, the complements of said sequences, and DNA sequences that hybridize to
a sequence recited in SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200,
203, 215-225, 237, 239, 261-276, 292, 293, 303-308, 310-315, 317, 321, 323, 325-327,
330-332, 334, 336, 338, 340 and 342-347; and detecting an immune response on the
patient's skin. Diagnostic kits for use in such methods are also provided.

These and other aspects of the present invention will become apparent 
upon reference to the following detailed description and attached drawings. All 
references disclosed herein are hereby incorporated by reference in their entirety as if
each was incorporated individually.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE IDENTIFIERS 

Figure 1A and B illustrate the stimulation of proliferation and interferon-γ
production in T cells derived from a first and a second M. tuberculosis-immune donor,
respectively, by the 14 Kd, 20 Kd and 26 Kd antigens described in Example 1.

Figure 2 illustrates the stimulation of proliferation and interferon-γ
production in T cells derived from an M. tuberculosis-immune individual by the two
representative polypeptides TbRa3 and TbRa9.

Figures 3A-D illustrate the reactivity of antisera raised against secretory
M. tuberculosis proteins, the known M. tuberculosis antigen 85b and the inventive
antigens Tb38-1 and TbH-9, respectively, with M. tuberculosis lysate (lane 2), M.
tuberculosis secretory proteins (lane 3), recombinant Tb38-1 (lane 4), recombinant
TbH-9 (lane 5) and recombinant 85b (lane 5).

Figure 4A illustrates the stimulation of proliferation in a TbH-9-specific
T cell clone by secretory M. tuberculosis proteins, recombinant TbH-9 and a control
antigen, TbRa11.
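Figure 4B illustrates the stimulation of interferon-γ production in a
TbH-9-specific T cell clone by secretory M. tuberculosis proteins, PPD and
recombinant TbH-9.

Figures 5A and B illustrate the stimulation of proliferation and
interferon-γ production in TbH9-specific T cells by the fusion protein TbH9-Tb38-1.

Figures 6A and B illustrate the stimulation of proliferation and
interferon-γ production in Tb38-1-specific T cells by the fusion protein TbH9-Tb38-1.

Figures 7A and B illustrate the stimulation of proliferation and
interferon-γ production in T cells previously shown to respond to both TbH-9 and
Tb38-1 by the fusion protein TbH9-Tb38-1.

Figures 8A and B illustrate the stimulation of proliferation and
interferon-γ production in T cells derived from a first M. tuberculosis-immune
individual by the representative polypeptides XP-1, RDIF6, RDIF8, RDIF10 and
RDIF11.

Figures 9A and B illustrate the stimulation of proliferation and
interferon-γ production in T cells derived from a second M. tuberculosis-immune
individual by the representative polypeptides XP-1, RDIF6, RDIF8, RDIF10 and
RDIF11.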

SEQ. ID NO. 1 is the DNA sequence of TbRa1.
SEQ. ID NO. 2 is the DNA sequence of TbRa10.
SEQ. ID NO. 3 is the DNA sequence of TbRa11.
SEQ. ID NO. 4 is the DNA sequence of TbRa12.
SEQ. ID NO. 5 is the DNA sequence of TbRa13.
SEQ. ID NO. 6 is the DNA sequence of TbRa16.
SEQ. ID NO. 7 is the DNA sequence of TbRa17.
SEQ. ID NO. 8 is the DNA sequence of TbRa18.
SEQ. ID NO. 9 is the DNA sequence of TbRa19.
SEQ. ID NO. 10 is the DNA sequence of TbRa24.
SEQ. ID NO. 11 is the DNA sequence of TbRa26.
SEQ. ID NO. 12 is the DNA sequence of TbRa28.
SEQ. ID NO. 13 is the DNA sequence of TbRa29.
SEQ. ID NO. 14 is the DNA sequence of TbRa2A. 
SEQ. ID NO. 15 is the DNA sequence of TbRa3. 
SEQ. ID NO. 16 is the DNA sequence of TbRa32. 
SEQ. ID NO. 17 is the DNA sequence of TbRa35. 
SEQ. ID NO. 18 is the DNA sequence of TbRa36. 
SEQ. ID NO. 19 is the DNA sequence of TbRa4. 
SEQ. ID NO. 20 is the DNA sequence of TbRa9. 
SEQ. ID NO. 21 is the DNA sequence of TbRaB. 
SEQ. ID NO. 22 is the DNA sequence of TbRaC. 
SEQ. ID NO. 23 is the DNA sequence of TbRaD. 
SEQ. ID NO. 24 is the DNA sequence of YYWCPG.
SEQ. ID NO. 25 is the DNA sequence of AAMK. 
SEQ. ID NO. 26 is the DNA sequence of TbL-23. 
SEQ. ID NO. 27 is the DNA sequence of TbL-24. 
SEQ. ID NO. 28 is the DNA sequence of TbL-25. 
SEQ. ID NO. 29 is the DNA sequence of TbL-28. 
SEQ. ID NO. 30 is the DNA sequence of TbL-29. 
SEQ. ID NO. 31 is the DNA sequence of TbH-5. 
SEQ. ID NO. 32 is the DNA sequence of TbH-8. 
SEQ. ID NO. 33 is the DNA sequence of TbH-9. 
SEQ. ID NO. 34 is the DNA sequence of TbM-1. 
SEQ. ID NO. 35 is the DNA sequence of TbM-3. 
SEQ. ID NO. 36 is the DNA sequence of TbM-6. 
SEQ. ID NO. 37 is the DNA sequence of TbM-7. 
SEQ. ID NO. 38 is the DNA sequence of TbM-9. 
SEQ. ID NO. 39 is the DNA sequence of TbM-12. 
SEQ. ID NO. 40 is the DNA sequence of TbM-13. 
SEQ. ID NO. 41 is the DNA sequence of TbM-14.
SEQ. ID NO. 42 is the DNA sequence of TbM-15. 
SEQ. ID NO. 43 is the DNA sequence of TbH-4.
SEQ. ID NO. 44 is the DNA sequence of TbH-4-FWD. 
SEQ. ID NO. 45 is the DNA sequence of TbH-12. 
SEQ. ID NO. 46 is the DNA sequence of Tb38-1. 
SEQ. ID NO. 47 is the DNA sequence of Tb38-4. 
SEQ. ID NO. 48 is the DNA sequence of TbL-17. 
SEQ. ID NO. 49 is the DNA sequence of TbL-20. 
SEQ. ID NO. 50 is the DNA sequence of TbL-21. 
SEQ. ID NO. 51 is the DNA sequence of TbH-16. 
SEQ. ID NO. 52 is the DNA sequence of DPEP. 
SEQ. ID NO. 53 is the deduced amino acid sequence of DPEP. 
SEQ. ID NO. 54 is the protein sequence of DPV N-terminal Antigen. 
SEQ. ID NO. 55 is the protein sequence of AVGS N-terminal Antigen. 
SEQ. ID NO. 56 is the protein sequence of AAMK N-terminal Antigen. 
SEQ. ID NO. 57 is the protein sequence of YYWC N-terminal Antigen.
SEQ. ID NO. 58 is the protein sequence of DIGS N-terminal Antigen.
SEQ. ID NO. 59 is the protein sequence of AEES N-terminal Antigen.
SEQ. ID NO. 60 is the protein sequence of DPEP N-terminal Antigen.
SEQ. ID NO. 61 is the protein sequence of APKT N-terminal Antigen.
SEQ. ID NO. 62 is the protein sequence of DPAS N-terminal Antigen.
SEQ. ID NO. 63 is the deduced amino acid sequence of TbRa1.
SEQ. ID NO. 64 is the deduced amino acid sequence of TbRa10.
SEQ. ID NO. 65 is the deduced amino acid sequence of TbRa11.
SEQ. ID NO. 66 is the deduced amino acid sequence of TbRa12.
SEQ. ID NO. 67 is the deduced amino acid sequence of TbRa13.
SEQ. ID NO. 68 is the deduced amino acid sequence of TbRa16.
SEQ. ID NO. 69 is the deduced amino acid sequence of TbRa17.
SEQ. ID NO. 70 is the deduced amino acid sequence of TbRa18.
SEQ. ID NO. 71 is the deduced amino acid sequence of TbRa19.
SEQ. ID NO. 72 is the deduced amino acid sequence of TbRa24. 
SEQ. ID NO. 73 is the deduced amino acid sequence of TbRa26.
SEQ. ID NO. 74 is the deduced amino acid sequence of TbRa28. 
SEQ. ID NO. 75 is the deduced amino acid sequence of TbRa29. 
SEQ. ID NO. 76 is the deduced amino acid sequence of TbRa2A. 
SEQ. ID NO. 77 is the deduced amino acid sequence of TbRa3. 
SEQ. ID NO. 78 is the deduced amino acid sequence of TbRa32. 
SEQ. ID NO. 79 is the deduced amino acid sequence of TbRa35. 
SEQ. ID NO. 80 is the deduced amino acid sequence of TbRa36. 
SEQ. ID NO. 81 is the deduced amino acid sequence of TbRa4. 
SEQ. ID NO. 82 is the deduced amino acid sequence of TbRa9. 
SEQ. ID NO. 83 is the deduced amino acid sequence of TbRaB. 
SEQ. ID NO. 84 is the deduced amino acid sequence of TbRaC. 
SEQ. ID NO. 85 is the deduced amino acid sequence of TbRaD. 
SEQ. ID NO. 86 is the deduced amino acid sequence of YYWCPG. 
SEQ. ID NO. 87 is the deduced amino acid sequence of TbAAMK. 
SEQ. ID NO. 88 is the deduced amino acid sequence of Tb38-1. 
SEQ. ID NO. 89 is the deduced amino acid sequence of TbH-4. 
SEQ. ID NO. 90 is the deduced amino acid sequence of TbH-8. 
SEQ. ID NO. 91 is the deduced amino acid sequence of TbH-9.
SEQ. ID NO. 92 is the deduced amino acid sequence of TbH-12. 
SEQ. ID NO. 93 is the amino acid sequence of Tb38-1 Peptide 1.
SEQ. ID NO. 94 is the amino acid sequence of Tb38-1 Peptide 2.
SEQ. ID NO. 95 is the amino acid sequence of Tb38-1 Peptide 3.
SEQ. ID NO. 96 is the amino acid sequence of Tb38-1 Peptide 4. 
SEQ. ID NO. 97 is the amino acid sequence of Tb38-1 Peptide 5. 
SEQ. ID NO. 98 is the amino acid sequence of Tb38-1 Peptide 6. 
SEQ. ID NO. 99 is the DNA sequence of DPAS. 
SEQ. ID NO. 100 is the deduced amino acid sequence of DPAS. 
SEQ. ID NO. 101 is the DNA sequence of DPV. 
SEQ. ID NO. 102 is the deduced amino acid sequence of DPV.
SEQ. ID NO. 103 is the DNA sequence of ESAT-6.
SEQ. ID NO. 104 is the deduced amino acid sequence of ESAT-6.
SEQ. ID NO. 105 is the DNA sequence of TbH-8-2.
SEQ. ID NO. 106 is the DNA sequence of TbH-9FL.
SEQ. ID NO. 107 is the deduced amino acid sequence of TbH-9FL.
SEQ. ID NO. 108 is the DNA sequence of TbH-9-1.
SEQ. ID NO. 109 is the deduced amino acid sequence of TbH-9-1.
SEQ. ID NO. 110 is the DNA sequence of TbH-9-4.
SEQ. ID NO. 111 is the deduced amino acid sequence of TbH-9-4.
SEQ. ID NO. 112 is the DNA sequence of Tb38-1F2 IN.
SEQ. ID NO. 113 is the DNA sequence of Tb38-2F2 RP.
SEQ. ID NO. 114 is the deduced amino acid sequence of Tb37-FL.
SEQ. ID NO. 115 is the deduced amino acid sequence of Tb38-IN.
SEQ. ID NO. 116 is the DNA sequence of Tb38-1F3.
SEQ. ID NO. 117 is the deduced amino acid sequence of Tb38-1F3.
SEQ. ID NO. 118 is the DNA sequence of Tb38-1F5.
SEQ. ID NO. 119 is the DNA sequence of Tb38-1F6.
SEQ. ID NO. 120 is the deduced N-terminal amino acid sequence of DPV.
SEQ. ID NO. 121 is the deduced N-terminal amino acid sequence of AVGS.
SEQ. ID NO. 122 is the deduced N-terminal amino acid sequence of AAMK.
SEQ. ID NO. 123 is the deduced N-terminal amino acid sequence of YYWC.
SEQ. ID NO. 124 is the deduced N-terminal amino acid sequence of DIGS.
SEQ. ID NO. 125 is the deduced N-terminal amino acid sequence of AEES.
SEQ. ID NO. 126 is the deduced N-terminal amino acid sequence of DPEP.
SEQ. ID NO. 127 is the deduced N-terminal amino acid sequence of APKT.
SEQ. ID NO. 128 is the deduced amino acid sequence of DPAS.
SEQ. ID NO. 129 is the protein sequence of DPPD N-terminal Antigen.
SEQ ID NO. 130-133 are the protein sequences of four DPPD cyanogen
bromide fragments.
SEQ ID NO. 134 is the N-terminal protein sequence of XDS antigen.
SEQ ID NO. 135 is the N-terminal protein sequence of AGD antigen.
SEQ ID NO. 136 is the N-terminal protein sequence of APE antigen.
SEQ ID NO. 137 is the N-terminal protein sequence of XYI antigen.
SEQ ID NO. 138 is the DNA sequence of TbH-29.
SEQ ID NO. 139 is the DNA sequence of TbH-30.
SEQ ID NO. 140 is the DNA sequence of TbH-32.
SEQ ID NO. 141 is the DNA sequence of TbH-33.
SEQ ID NO. 142 is the predicted amino acid sequence of TbH-29.
SEQ ID NO. 143 is the predicted amino acid sequence of TbH-30.
SEQ ID NO. 144 is the predicted amino acid sequence of TbH-32.
SEQ ID NO. 145 is the predicted amino acid sequence of TbH-33.
SEQ ID NO: 146-151 are PCR primers used in the preparation of a fusion
protein containing TbRa3, 38 kD and Tb38-1.
SEQ ID NO: 152 is the DNA sequence of the fusion protein containing TbRa3,
38 kD and Tb38-1.
SEQ ID NO: 153 is the amino acid sequence of the fusion protein containing
TbRa3, 38 kD and Tb38-1.
SEQ ID NO: 154 is the DNA sequence of the M. tuberculosis antigen 38 kD.
SEQ ID NO: 155 is the amino acid sequence of the M. tuberculosis antigen 38
kD.
SEQ ID NO: 156 is the DNA sequence of XP14.
SEQ ID NO: 157 is the DNA sequence of XP24.
SEQ ID NO: 158 is the DNA sequence of XP31.
SEQ ID NO: 159 is the 5' DNA sequence of XP32.
SEQ ID NO: 160 is the 3' DNA sequence of XP32.
SEQ ID NO: 161 is the predicted amino acid sequence of XP14.
SEQ ID NO: 162 is the predicted amino acid sequence encoded by the reverse
complement of XP14.

SEQ ID NO: 163 is the DNA sequence of XP27. 
SEQ ID NO: 164 is the DNA sequence of XP36. 



SEQ ID NO: 165 is the 5' DNA sequence of XP4.
SEQ ID NO: 166 is the 5' DNA sequence of XP5.
SEQ ID NO: 167 is the 5' DNA sequence of XP17.
SEQ ID NO: 168 is the 5' DNA sequence of XP30.
SEQ ID NO: 169 is the 5' DNA sequence of XP2.
SEQ ID NO: 170 is the 3' DNA sequence of XP2.
SEQ ID NO: 171 is the 5' DNA sequence of XP3.
SEQ ID NO: 172 is the 3' DNA sequence of XP3.
SEQ ID NO: 173 is the 5' DNA sequence of XP6.
SEQ ID NO: 174 is the 3' DNA sequence of XP6.
SEQ ID NO: 175 is the 5' DNA sequence of XP18.
SEQ ID NO: 176 is the 3' DNA sequence of XP18.
SEQ ID NO: 177 is the 5' DNA sequence of XP19.
SEQ ID NO: 178 is the 3' DNA sequence of XP19.
SEQ ID NO: 179 is the 5' DNA sequence of XP22.
SEQ ID NO: 180 is the 3' DNA sequence of XP22.
SEQ ID NO: 181 is the 5' DNA sequence of XP25.
SEQ ID NO: 182 is the 3' DNA sequence of XP25.
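SEQ ID NO: 183 is the full-length DNA sequence of TbH4-XP1.
SEQ ID NO: 184 is the predicted amino acid sequence of TbH4-XP1.
SEQ ID NO: 185 is the predicted amino acid sequence encoded by the reverse
complement of TbH4-XP1.
SEQ ID NO: 186 is a first predicted amino acid sequence for XP36.
SEQ ID NO: 187 is a second predicted amino acid sequence for XP36.
SEQ ID NO: 188 is the predicted amino acid sequence encoded by the reverse
complement of XP36.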

SEQ ID NO: 189 is the DNA sequence of RDIF2.
SEQ ID NO: 190 is the DNA sequence of RDIF5.
SEQ ID NO: 191 is the DNA sequence of RDIF8.
SEQ ID NO: 192 is the DNA sequence of RDIF10.
SEQ ID NO: 193 is the DNA sequence of RDIF11.
SEQ ID NO: 194 is the predicted amino acid sequence of RDIF2.
SEQ ID NO: 195 is the predicted amino acid sequence of RDIF5.
SEQ ID NO: 196 is the predicted amino acid sequence of RDIF8.
SEQ ID NO: 197 is the predicted amino acid sequence of RDIF10.
SEQ ID NO: 198 is the predicted amino acid sequence of RDIF11.
SEQ ID NO: 199 is the 5' DNA sequence of RDIF12.
SEQ ID NO: 200 is the 3' DNA sequence of RDIF12.
SEQ ID NO: 201 is the DNA sequence of RDIF7.
SEQ ID NO: 202 is the predicted amino acid sequence of RDIF7.
SEQ ID NO: 203 is the DNA sequence of DIF2-1.
SEQ ID NO: 204 is the predicted amino acid sequence of DIF2-1.
SEQ ID NO: 205-212 are PCR primers used in the preparation of a fusion protein
containing TbRa3, 38 kD, Tb38-1 and DPEP (hereinafter referred to as TbF-2).
SEQ ID NO: 213 is the DNA sequence of the fusion protein TbF-2.
SEQ ID NO: 214 is the amino acid sequence of the fusion protein TbF-2.

SEQ ID NO: 215 is the 5' DNA sequence of MO-1.
SEQ ID NO: 216 is the 5' DNA sequence for MO-2.
SEQ ID NO: 217 is the 5' DNA sequence for MO-4.
SEQ ID NO: 218 is the 5' DNA sequence for MO-8.
SEQ ID NO: 219 is the 5' DNA sequence for MO-9.
SEQ ID NO: 220 is the 5' DNA sequence for MO-26.
SEQ ID NO: 221 is the 5' DNA sequence for MO-28.
SEQ ID NO: 222 is the 5' DNA sequence for MO-29.
SEQ ID NO: 223 is the 5' DNA sequence for MO-30.
SEQ ID NO: 224 is the 5' DNA sequence for MO-34.
SEQ ID NO: 225 is the 5' DNA sequence for MO-35.
SEQ ID NO: 226 is the predicted amino acid sequence for MO-1.
SEQ ID NO: 227 is the predicted amino acid sequence for MO-2.
SEQ ID NO: 228 is the predicted amino acid sequence for MO-4
SEQ ID NO: 229 is the predicted amino acid sequence for MO-8
SEQ ID NO: 230 is the predicted amino acid sequence for MO-9
SEQ ID NO: 231 is the predicted amino acid sequence for MO-26
SEQ ID NO: 232 is the predicted amino acid sequence for MO-28
SEQ ID NO: 233 is the predicted amino acid sequence for MO-29
SEQ ID NO: 234 is the predicted amino acid sequence for MO-30
SEQ ID NO: 235 is the predicted amino acid sequence for MO-34
SEQ ID NO: 236 is the predicted amino acid sequence for MO-35
SEQ ID NO: 237 is the determined DNA sequence for MO-10
SEQ ID NO: 238 is the predicted amino acid sequence for MO-10
SEQ ID NO: 239 is the 3' DNA sequence for MO-27
SEQ ID NO: 240 is the full-length DNA sequence for DPPD
SEQ ID NO: 241 is the predicted full-length amino acid sequence for DPPD
SEQ ID NO: 242 is the determined 5' cDNA sequence for LSER-10
SEQ ID NO: 243 is the determined 5' cDNA sequence for LSER-11
SEQ ID NO: 244 is the determined 5' cDNA sequence for LSER-12
SEQ ID NO: 245 is the determined 5' cDNA sequence for LSER-13
SEQ ID NO: 246 is the determined 5' cDNA sequence for LSER-16
SEQ ID NO: 247 is the determined 5' cDNA sequence for LSER-25
SEQ ID NO: 248 is the predicted amino acid sequence for LSER-10
SEQ ID NO: 249 is the predicted amino acid sequence for LSER-12
SEQ ID NO: 250 is the predicted amino acid sequence for LSER-13
SEQ ID NO: 251 is the predicted amino acid sequence for LSER-16
SEQ ID NO: 252 is the predicted amino acid sequence for LSER-25
SEQ ID NO: 253 is the determined cDNA sequence for LSER-18
SEQ ID NO: 254 is the determined cDNA sequence for LSER-23
SEQ ID NO: 255 is the determined cDNA sequence for LSER-24
SEQ ID NO: 256 is the determined cDNA sequence for LSER-27
SEQ ID NO: 257 is the predicted amino acid sequence for LSER-18
SEQ ID NO: 258 is the predicted amino acid sequence for LSER-23
SEQ ID NO: 259 is the predicted amino acid sequence for LSER-24
SEQ ID NO: 260 is the predicted amino acid sequence for LSER-27
SEQ ID NO: 261 is the determined 5' cDNA sequence for LSER-1

SEQ ID NO: 262 is the determined 5' cDNA sequence for LSER-3
SEQ ID NO: 263 is the determined 5' cDNA sequence for LSER-4
SEQ ID NO: 264 is the determined 5' cDNA sequence for LSER-5
SEQ ID NO: 265 is the determined cDNA sequence for LSER-6
SEQ ID NO: 266 is the determined 5' cDNA sequence for LSER-8
SEQ ID NO: 267 is the determined 5' cDNA sequence for LSER-14
SEQ ID NO: 268 is the determined 5' cDNA sequence for LSER-15
SEQ ID NO: 269 is the determined 5' cDNA sequence for LSER-17
SEQ ID NO: 270 is the determined cDNA sequence for LSER-19
SEQ ID NO: 271 is the determined 5' cDNA sequence for LSER-20
SEQ ID NO: 272 is the determined 5' cDNA sequence for LSER-22
SEQ ID NO: 273 is the determined 5' cDNA sequence for LSER-26
SEQ ID NO: 274 is the determined 5' cDNA sequence for LSER-28
SEQ ID NO: 275 is the determined 5' cDNA sequence for LSER-29
SEQ ID NO: 276 is the determined 5' cDNA sequence for LSER-30
SEQ ID NO: 277 is the predicted amino acid sequence for LSER-1
SEQ ID NO: 278 is the predicted amino acid sequence for LSER-3
SEQ ID NO: 279 is the predicted amino acid sequence for LSER-5
SEQ ID NO: 280 is the predicted amino acid sequence for LSER-6
SEQ ID NO: 281 is the predicted amino acid sequence for LSER-8
SEQ ID NO: 282 is the predicted amino acid sequence for LSER-14
SEQ ID NO: 283 is the predicted amino acid sequence for LSER-15
SEQ ID NO: 284 is the predicted amino acid sequence for LSER-17
SEQ ID NO: 285 is the predicted amino acid sequence for LSER-19
SEQ ID NO: 286 is the predicted amino acid sequence for LSER-20
SEQ ID NO: 287 is the predicted amino acid sequence for LSER-22
SEQ ID NO: 288 is the predicted amino acid sequence for LSER-26
SEQ ID NO: 289 is the predicted amino acid sequence for LSER-28
SEQ ID NO: 290 is the predicted amino acid sequence for LSER-29
SEQ ID NO: 291 is the predicted amino acid sequence for LSER-30
SEQ ID NO: 292 is the determined cDNA sequence for LSER-9
SEQ ID NO: 293 is the determined cDNA sequence for the reverse complement
of LSER-6
SEQ ID NO: 294 is the predicted amino acid sequence for the reverse
complement of LSER-6

SEQ ID NO: 295 is the determined 5' cDNA sequence for MO-12
SEQ ID NO: 296 is the determined 5' cDNA sequence for MO-13
SEQ ID NO: 297 is the determined 5' cDNA sequence for MO-19
SEQ ID NO: 298 is the determined 5' cDNA sequence for MO-39
SEQ ID NO: 299 is the predicted amino acid sequence for MO-12
SEQ ID NO: 300 is the predicted amino acid sequence for MO-13
SEQ ID NO: 301 is the predicted amino acid sequence for MO-19
SEQ ID NO: 302 is the predicted amino acid sequence for MO-39
SEQ ID NO: 303 is the determined 5' cDNA sequence for Erdsn-1
SEQ ID NO: 304 is the determined 5' cDNA sequence for Erdsn-2
SEQ ID NO: 305 is the determined 5' cDNA sequence for Erdsn-4
SEQ ID NO: 306 is the determined 5' cDNA sequence for Erdsn-5
SEQ ID NO: 307 is the determined 5' cDNA sequence for Erdsn-6
SEQ ID NO: 308 is the determined 5' cDNA sequence for Erdsn-7
SEQ ID NO: 309 is the determined 5' cDNA sequence for Erdsn-8
SEQ ID NO: 310 is the determined 5' cDNA sequence for Erdsn-9
SEQ ID NO: 311 is the determined 5' cDNA sequence for Erdsn-10
SEQ ID NO: 312 is the determined 5' cDNA sequence for Erdsn-12
SEQ ID NO: 313 is the determined 5' cDNA sequence for Erdsn-13
SEQ ID NO: 314 is the determined 5' cDNA sequence for Erdsn-14
SEQ ID NO: 315 is the determined 5' cDNA sequence for Erdsn-15






SEQ ID NO: 316 is the determined 5' cDNA sequence for Erdsn-16
SEQ ID NO: 317 is the determined 5' cDNA sequence for Erdsn-17
SEQ ID NO: 318 is the determined 5' cDNA sequence for Erdsn-18
SEQ ID NO: 319 is the determined 5' cDNA sequence for Erdsn-21
SEQ ID NO: 320 is the determined 5' cDNA sequence for Erdsn-22
SEQ ID NO: 321 is the determined 5' cDNA sequence for Erdsn-23
SEQ ID NO: 322 is the determined 5' cDNA sequence for Erdsn-25
SEQ ID NO: 323 is the determined 3' cDNA sequence for Erdsn-1
SEQ ID NO: 324 is the determined 3' cDNA sequence for Erdsn-2
SEQ ID NO: 325 is the determined 3' cDNA sequence for Erdsn-4
SEQ ID NO: 326 is the determined 3' cDNA sequence for Erdsn-5
SEQ ID NO: 327 is the determined 3' cDNA sequence for Erdsn-7
SEQ ID NO: 328 is the determined 3' cDNA sequence for Erdsn-8
SEQ ID NO: 329 is the determined 3' cDNA sequence for Erdsn-9
SEQ ID NO: 330 is the determined 3' cDNA sequence for Erdsn-10
SEQ ID NO: 331 is the determined 3' cDNA sequence for Erdsn-12
SEQ ID NO: 332 is the determined 3' cDNA sequence for Erdsn-13
SEQ ID NO: 333 is the determined 3' cDNA sequence for Erdsn-14
SEQ ID NO: 334 is the determined 3' cDNA sequence for Erdsn-15
SEQ ID NO: 335 is the determined 3' cDNA sequence for Erdsn-16
SEQ ID NO: 336 is the determined 3' cDNA sequence for Erdsn-17
SEQ ID NO: 337 is the determined 3' cDNA sequence for Erdsn-18
SEQ ID NO: 338 is the determined 3' cDNA sequence for Erdsn-21
SEQ ID NO: 339 is the determined 3' cDNA sequence for Erdsn-22
SEQ ID NO: 340 is the determined 3' cDNA sequence for Erdsn-23
SEQ ID NO: 341 is the determined 3' cDNA sequence for Erdsn-25
SEQ ID NO: 342 is the determined cDNA sequence for Erdsn-24
SEQ ID NO: 343 is the determined amino acid sequence for a M. tuberculosis
85b precursor homolog
SEQ ID NO: 344 is the determined amino acid sequence for spot 1
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SEQ ID NO: 345 is a determined amino acid sequence for spot 2 
SEQ ID NO: 346 is a determined amino acid sequence for spot 2 
SEQ ID NO: 347 is the determined amino acid seq for spot 4 
SEQ ID NO: 348 is the sequence of primer PDM-157 
5 SEQ ID NO: 349 is the sequence of primer PDM-160 

SEQ ID NO: 350 is the DNA sequence of the fusion protein TbF-6 
SEQ ID NO: 351 is the amino acid sequence of fusion protein TbF-6 
SEQ ID NO: 352 is the sequence of primer PDM-176 
SEQ ID NO: 353 is the sequence of primer PDM-175 
> SEQ ID NO: 354 is the DNA sequence of the fusion protein TbF-8 

SEQ ID NO: 355 is the amino acid sequence of the fusion protein TbF-8 

DETAILED DESCRIPTION OF THE INVENTION 

As noted above, the present invention is generally directed to
compositions and methods for preventing, treating and diagnosing tuberculosis. The
compositions of the subject invention include polypeptides that comprise at least one
immunogenic portion of a M. tuberculosis antigen, or a variant of such an antigen that
differs only in conservative substitutions and/or modifications. Polypeptides within the
scope of the present invention include, but are not limited to, immunogenic soluble
M. tuberculosis antigens. A "soluble M. tuberculosis antigen" is a protein of
M. tuberculosis origin that is present in M. tuberculosis culture filtrate. As used herein,
the term "polypeptide" encompasses amino acid chains of any length, including full
length proteins (i.e., antigens), wherein the amino acid residues are linked by covalent
peptide bonds. Thus, a polypeptide comprising an immunogenic portion of one of the
above antigens may consist entirely of the immunogenic portion, or may contain
additional sequences. The additional sequences may be derived from the native
M. tuberculosis antigen or may be heterologous, and such sequences may (but need not)
be immunogenic.

"Immunogenic," as used herein, refers to the ability to elicit an immune
response (e.g., cellular) in a patient, such as a human, and/or in a biological sample. In
particular, antigens that are immunogenic (and immunogenic portions or other variants
of such antigens) are capable of stimulating [...]. Polypeptides comprising at
least an immunogenic portion of one or more M. tuberculosis antigens may generally be
used to detect tuberculosis or to induce protective immunity against tuberculosis in a
patient.

The compositions and methods of the present invention also encompass
variants of the above polypeptides and DNA molecules. A polypeptide "variant," as
used herein, is a polypeptide that differs from the recited polypeptide only in
conservative substitutions and/or modifications, such that the therapeutic, antigenic
and/or immunogenic properties of the polypeptide are retained. Polypeptide variants
preferably exhibit at least about 70%, more preferably at least about 90% and most
preferably at least about 95% identity to the identified polypeptides. For polypeptides
with immunoreactive properties, variants may, alternatively, be identified by modifying
the amino acid sequence of one of the above polypeptides, and evaluating the
immunoreactivity of the modified polypeptide. For polypeptides useful for the
generation of diagnostic binding agents, a variant may be identified by evaluating a
modified polypeptide for the ability to generate antibodies that detect the presence or
absence of tuberculosis. Such modified sequences may be prepared and tested using,
for example, the representative procedures described herein.
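
By way of illustration only, the identity criterion for variants may be sketched in Python as follows. The function names, the requirement that the two sequences be supplied already aligned and of equal length, and the use of one-letter residue codes are assumptions introduced for this sketch and are not part of the disclosure; the default floor of 70% is the lowest identity level recited above.

```python
def percent_identity(reference: str, variant: str) -> float:
    """Percent of aligned positions carrying the same residue.

    Assumes both sequences are pre-aligned and of equal length, with any
    gaps written as '-'. Performing the alignment is outside this sketch.
    """
    if not reference or len(reference) != len(variant):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(reference, variant) if a == b and a != "-")
    return 100.0 * matches / len(reference)


def meets_identity_floor(reference: str, variant: str, floor: float = 70.0) -> bool:
    """True if the variant reaches the chosen identity floor (default 70%)."""
    return percent_identity(reference, variant) >= floor
```

A stricter tier can be checked by passing a higher `floor` value.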

As used herein, a "conservative substitution" is one in which an amino
acid is substituted for another amino acid that has similar properties, such that one
skilled in the art of peptide chemistry would expect the secondary structure and
hydropathic nature of the polypeptide to be substantially unchanged. In general, the
following groups of amino acids represent conservative changes: (1) ala, pro, gly, glu,
asp, gln, asn, ser, thr; (2) cys, ser, tyr, thr; (3) val, ile, leu, met, ala, phe; (4) lys, arg, his;
and (5) phe, tyr, trp, his.
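
The grouping above lends itself to a simple membership test. The following Python sketch is illustrative only: the function name is hypothetical, and the membership of groups (3) and (4) is an assumption, read here from a partly illegible passage.

```python
# Conservative-substitution groups, using three-letter residue codes.
# Groups 3 and 4 are an assumed reading of a partly illegible passage.
CONSERVATIVE_GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues share at least one group and differ."""
    a, b = original.lower(), replacement.lower()
    if a == b:
        return False  # identical residues: not a substitution at all
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Because some residues (for example ser, thr, ala, phe, tyr, his) appear in more than one group, the test asks only whether any single group contains both residues.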

Variants may also, or alternatively, contain other modifications,
including the deletion or addition of amino acids that have minimal influence on the
antigenic properties, secondary structure and hydropathic nature of the polypeptide. For
example, a polypeptide [...] of the polypeptide (e.g., poly-His), or to enhance
binding of the polypeptide to a solid support. For example, a polypeptide may be
conjugated to an immunoglobulin Fc region.

[...] overnight; followed by two washes of [...] minutes each in 1X SSC, 0.1% SDS at
65 °C and two washes of [...] minutes each in 0.2X SSC, 0.1% SDS at 65 °C.

In a related aspect, combination polypeptides are disclosed. A
"combination polypeptide" is a polypeptide comprising at least one of the [...] that does
not significantly diminish the immunogenic properties of the component polypeptides.

In general, M. tuberculosis antigens, and DNA sequences encoding such
antigens, may be prepared using any of a variety of procedures. For example, soluble
antigens may be isolated from M. tuberculosis culture filtrate by procedures known to
those of ordinary skill in the art, including anion-exchange and reverse phase
chromatography. Purified antigens are then evaluated for their ability to elicit an
appropriate immune response (e.g., cellular) using, for example, the representative
methods described herein. Immunogenic antigens may then be partially sequenced
using techniques such as traditional Edman chemistry. See Edman and Berg, Eur. J.
Biochem. 80:116-132, 1967.

Immunogenic antigens may also be produced recombinantly using a
DNA sequence that encodes the antigen, which has been inserted into an expression
vector and expressed in an appropriate host. DNA molecules encoding soluble antigens
may be isolated by screening an appropriate M. tuberculosis expression library with
anti-sera (e.g., rabbit) raised specifically against soluble M. tuberculosis antigens. DNA
sequences encoding antigens that may or may not be soluble may be identified by
screening an appropriate M. tuberculosis genomic or cDNA expression library with sera
obtained from patients infected with M. tuberculosis. Such screens may generally be
performed using techniques well known to those of ordinary skill in the art, such as
those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989.

DNA sequences encoding soluble antigens may also be obtained by
screening an appropriate M. tuberculosis cDNA or genomic DNA library for DNA
sequences that hybridize to degenerate oligonucleotides derived from partial amino acid
sequences of isolated soluble antigens. Degenerate oligonucleotide sequences for use in
such a screen may be designed and synthesized, and the screen may be performed, as
described (for example) in Sambrook et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989 (and references cited
therein). Polymerase chain reaction (PCR) may also be employed, using the above
oligonucleotides in methods well known in the art, to isolate a nucleic acid probe from a
cDNA or genomic library. The library screen may then be performed using the isolated
probe.
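
As a hedged illustration of how a degenerate oligonucleotide might be chosen from a partial amino acid sequence, the Python sketch below counts the codons available for each residue under the standard genetic code and selects the stretch of peptide with the lowest overall degeneracy. The function names and the window length are assumptions made for this sketch; the disclosure itself refers only to the methods of Sambrook et al.

```python
from math import prod

# Number of codons encoding each amino acid in the standard genetic code
# (one-letter residue codes).
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(CODON_COUNT[residue] for residue in peptide.upper())


def best_window(peptide: str, length: int = 6) -> tuple[str, int]:
    """Return the peptide window of the given length with lowest degeneracy."""
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    if not windows:
        raise ValueError("peptide is shorter than the requested window")
    return min(((w, degeneracy(w)) for w in windows), key=lambda item: item[1])
```

Windows rich in Met and Trp (one codon each) give the least degenerate probes, whereas Leu, Arg and Ser (six codons each) inflate the pool size.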

Alternatively, genomic or cDNA libraries derived from M. tuberculosis
may be screened directly using peripheral blood mononuclear cells (PBMCs) or T cell
lines or clones derived from one or more M. tuberculosis-immune individuals. In
general, PBMCs and/or T cells for use in such screens may be prepared as described
below. Direct library screens may generally be performed by assaying
expressed recombinant proteins for the ability to induce proliferation and/or interferon-γ
production in T cells derived from an M. tuberculosis-immune individual.
Alternatively, potential T cell antigens may be first selected based on antibody
reactivity, as described above.

Regardless of the method of preparation, the antigens (and immunogenic
portions thereof) described herein (which may or may not be soluble) have the ability to
induce an immunogenic response. More specifically, the antigens have the ability to
induce proliferation and/or cytokine production (i.e., interferon-γ and/or interleukin-12
production) in T cells, NK cells, B cells and/or macrophages derived from an
M. tuberculosis-immune individual. The selection of cell type for use in evaluating an
immunogenic response to an antigen will, of course, depend on the desired response. For
example, interleukin-12 production is most readily evaluated using preparations
containing B cells and/or macrophages. An M. tuberculosis-immune individual is one
who is considered to be resistant to the development of tuberculosis by virtue of having
mounted an effective T cell response to M. tuberculosis (i.e., substantially free of
disease symptoms). Such individuals may be identified based on a strongly positive
(i.e., greater than about 10 mm diameter induration) intradermal skin test response to
tuberculosis proteins (PPD) and an absence of any signs or symptoms of tuberculosis
disease. T cells, NK cells, B cells and macrophages derived from M. tuberculosis-
immune individuals may be prepared using methods known to those of ordinary skill in
the art. For example, a preparation of PBMCs (i.e., peripheral blood mononuclear cells)
may be employed without further separation of component cells. PBMCs may
generally be prepared, for example, using density centrifugation through Ficoll™
(Winthrop Laboratories, NY). T cells for use in the assays described herein may also be
purified directly from PBMCs. Alternatively, an enriched T cell line reactive against
mycobacterial proteins, or T cell clones reactive to individual mycobacterial proteins,
may be employed. Such T cell clones may be generated by, for example, culturing
PBMCs from M. tuberculosis-immune individuals with mycobacterial proteins for a
period of 2-4 weeks. This allows expansion of only the mycobacterial protein-specific
T cells, resulting in a line composed solely of such cells. These cells may then be
cloned and tested with individual proteins, using methods known to those of ordinary
skill in the art, to more accurately define individual T cell specificity. In general,
antigens that test positive in assays for proliferation and/or cytokine production (i.e.,
interferon-γ and/or interleukin-12 production) performed using T cells, NK cells, B cells
and/or macrophages derived from an M. tuberculosis-immune individual are considered
immunogenic. Such assays may be performed, for example, using the representative
procedures described below. Immunogenic portions of such antigens may be identified
using similar assays, and may be present within the polypeptides described herein.

The ability of a polypeptide (e.g., an immunogenic antigen, or a portion
or other variant thereof) to induce cell proliferation is evaluated by contacting the cells
(e.g., T cells and/or NK cells) with the polypeptide and measuring the proliferation of
the cells. In general, the amount of polypeptide that is sufficient for evaluation of about
10⁵ cells ranges from about 10 ng/mL to about 100 µg/mL and preferably is about
10 µg/mL. The incubation of polypeptide with cells is typically performed at 37°C for about
six days. Following incubation with polypeptide, the cells are assayed for a
proliferative response, which may be evaluated by methods known to those of ordinary
skill in the art, such as exposing cells to a pulse of radiolabeled thymidine and
measuring the incorporation of label into cellular DNA. In general, a polypeptide that
results in at least a three fold increase in proliferation above background (i.e., the
proliferation observed for cells cultured without polypeptide) is considered to be able to
induce proliferation.
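
The three fold criterion above may be expressed as a ratio of label incorporation with and without polypeptide. The Python sketch below is illustrative only; the function names and the use of counts per minute (cpm) as the readout are assumptions of this sketch.

```python
def stimulation_index(cpm_with_polypeptide: float, cpm_background: float) -> float:
    """Fold increase in thymidine incorporation over the no-polypeptide control."""
    if cpm_background <= 0:
        raise ValueError("background incorporation must be positive")
    return cpm_with_polypeptide / cpm_background


def induces_proliferation(cpm_with_polypeptide: float,
                          cpm_background: float,
                          min_fold: float = 3.0) -> bool:
    """Apply the 'at least a three fold increase above background' criterion."""
    return stimulation_index(cpm_with_polypeptide, cpm_background) >= min_fold
```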

The ability of a polypeptide to stimulate the production of interferon-γ
and/or interleukin-12 in cells may be evaluated by contacting the cells with the
polypeptide and measuring the level of interferon-γ or interleukin-12 produced by the
cells. In general, the amount of polypeptide that is sufficient for the evaluation of about
10⁵ cells ranges from about 10 ng/mL to about 100 µg/mL and preferably is about
10 µg/mL. The polypeptide may, but need not, be immobilized on a solid support, such as a
bead or a biodegradable microsphere, such as those described in U.S. Patent
Nos. 4,897,268 and 5,075,109. The incubation of polypeptide with the cells is typically
performed at 37°C for about six days. Following incubation with polypeptide, the cells
are assayed for interferon-γ and/or interleukin-12 (or one or more subunits thereof),
which may be evaluated by methods known to those of ordinary skill in the art, such as
an enzyme-linked immunosorbent assay (ELISA) or, in the case of IL-12 P70 subunit, a
bioassay such as an assay measuring proliferation of T cells. In general, a polypeptide
that results in the production of at least 50 pg of interferon-γ per mL of cultured
supernatant (containing 10⁴-10⁵ T cells per mL) is considered able to stimulate the
production of interferon-γ. A polypeptide that stimulates the production of at least
10 pg/mL of IL-12 P70 subunit, and/or at least 100 pg/mL of IL-12 P40 subunit, per 10⁵
macrophages or B cells (or per 3 x 10⁵ PBMC) is considered able to stimulate the
production of IL-12.
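
For illustration, the cytokine cut-offs stated above can be collected into two small Python predicates. The function and parameter names are hypothetical, and the sketch assumes that concentrations have already been normalized to the cell numbers given in the text.

```python
IFN_GAMMA_MIN_PG_PER_ML = 50.0   # interferon-gamma cut-off from the text
IL12_P70_MIN_PG_PER_ML = 10.0    # IL-12 P70 subunit cut-off from the text
IL12_P40_MIN_PG_PER_ML = 100.0   # IL-12 P40 subunit cut-off from the text


def stimulates_ifn_gamma(ifn_gamma_pg_per_ml: float) -> bool:
    """True if supernatant interferon-gamma reaches the stated minimum."""
    return ifn_gamma_pg_per_ml >= IFN_GAMMA_MIN_PG_PER_ML


def stimulates_il12(p70_pg_per_ml: float = 0.0, p40_pg_per_ml: float = 0.0) -> bool:
    """True if either subunit reaches its minimum (the 'and/or' in the text)."""
    return (p70_pg_per_ml >= IL12_P70_MIN_PG_PER_ML
            or p40_pg_per_ml >= IL12_P40_MIN_PG_PER_ML)
```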

In general, immunogenic antigens are those antigens that stimulate
proliferation and/or cytokine production (i.e., interferon-γ and/or interleukin-12
production) in T cells, NK cells, B cells and/or macrophages derived from at least about
25% of M. tuberculosis-immune individuals. Among these immunogenic antigens,
polypeptides having superior therapeutic properties may be distinguished based on the
magnitude of the responses in the above assays and based on the percentage of
individuals for which a response is observed. In addition, antigens having superior
therapeutic properties will not stimulate proliferation and/or cytokine production in
vitro in cells derived from more than about 25% of individuals that are not
M. tuberculosis-immune, thereby eliminating responses that are not specifically due to
M. tuberculosis-responsive cells. Those antigens that induce a response in a high
percentage of T cell, NK cell, B cell and/or macrophage preparations from
M. tuberculosis-immune individuals (with a low incidence of responses in cell
preparations from other individuals) have superior therapeutic properties.
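
The population-level criteria above (a response in at least about 25% of immune donors, and in no more than about 25% of non-immune donors) may be sketched as follows. The function names and the representation of each donor as a single boolean assay outcome are assumptions made for illustration.

```python
def responder_fraction(responses: list[bool]) -> float:
    """Fraction of donors whose cells gave a positive assay result."""
    if not responses:
        raise ValueError("at least one donor is required")
    return sum(responses) / len(responses)


def is_immunogenic(immune_donor_responses: list[bool],
                   min_fraction: float = 0.25) -> bool:
    """Positive in at least about 25% of M. tuberculosis-immune donors."""
    return responder_fraction(immune_donor_responses) >= min_fraction


def is_specific(non_immune_donor_responses: list[bool],
                max_fraction: float = 0.25) -> bool:
    """Positive in no more than about 25% of non-immune donors."""
    return responder_fraction(non_immune_donor_responses) <= max_fraction
```

Under this sketch, an antigen with superior therapeutic properties would satisfy both predicates, with the responder fraction among immune donors as high as possible.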

Antigens with superior therapeutic properties may also be identified
based on their ability to diminish the severity of M. tuberculosis infection in
experimental animals, when administered as a vaccine. Suitable vaccine preparations
for use on experimental animals are described in detail below. Efficacy may be
determined based on the ability of the antigen to provide at least about a 50% reduction
in bacterial numbers and/or at least about a 40% decrease in mortality following
experimental infection. Suitable experimental animals include mice, guinea pigs and
primates.
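
As a minimal sketch of the efficacy criteria above, the following Python functions compare vaccinated and control groups. The names, and the choice of bacterial counts and mortality fractions as inputs, are assumptions introduced for illustration.

```python
def percent_reduction(control: float, vaccinated: float) -> float:
    """Percent decrease of the vaccinated value relative to control."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (control - vaccinated) / control


def is_efficacious(bacteria_control: float, bacteria_vaccinated: float,
                   mortality_control: float, mortality_vaccinated: float) -> bool:
    """At least ~50% fewer bacteria and/or at least ~40% lower mortality."""
    fewer_bacteria = percent_reduction(bacteria_control, bacteria_vaccinated) >= 50.0
    lower_mortality = percent_reduction(mortality_control, mortality_vaccinated) >= 40.0
    return fewer_bacteria or lower_mortality
```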

Antigens having superior diagnostic properties may generally be 
identified based on the ability to elicit a response in an intradermal skin test performed 
on an individual with active tuberculosis, but not in a test performed on an individual 
who is not infected with M. tuberculosis. Skin tests may generally be performed as 
described below, with a response of at least 5 mm induration considered positive.

Immunogenic portions of the antigens described herein may be prepared
and identified using well known techniques, such as those summarized in Paul,
Fundamental Immunology, 3d ed., Raven Press, 1993, pp. 243-247 and references cited
therein. Such techniques include screening polypeptide portions of the native antigen
for immunogenic properties. The representative proliferation and cytokine production
assays described herein may generally be employed in these screens. An immunogenic
portion of a polypeptide is a portion that, within such representative assays, generates
an immune response (e.g., proliferation, interferon-γ production and/or interleukin-12
production) that is substantially similar to that generated by the full length antigen. In
other words, an immunogenic portion of an antigen may generate at least about 20%,
and preferably about 100%, of the proliferation induced by the full length antigen in the
model proliferation assay described herein. An immunogenic portion may also, or
alternatively, stimulate the production of at least about 20%, and preferably about
100%, of the interferon-γ and/or interleukin-12 induced by the full length antigen in the
model assay described herein.
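
The comparison of a portion against the full length antigen may be sketched as a simple ratio test. In the Python below, the function name is hypothetical, and the 20% default reflects the "at least about 20%" level stated for cytokine production; applying the same default to proliferation is an assumption of this sketch.

```python
def is_immunogenic_portion(portion_response: float,
                           full_length_response: float,
                           min_fraction: float = 0.20) -> bool:
    """True if the portion yields at least min_fraction of the full length response.

    Both responses must come from the same assay (proliferation, or
    interferon-gamma / interleukin-12 production) under the same conditions.
    """
    if full_length_response <= 0:
        raise ValueError("full length response must be positive")
    return portion_response / full_length_response >= min_fraction
```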

Portions and other variants of M. tuberculosis antigens may be generated
by synthetic or recombinant means. Synthetic polypeptides having fewer than about
100 amino acids, and generally fewer than about 50 amino acids, may be generated
using techniques well known to those of ordinary skill in the art. For example, such
polypeptides may be synthesized using any of the commercially available solid-phase
techniques, such as the Merrifield solid-phase synthesis method, where amino acids are
sequentially added to a growing amino acid chain. See Merrifield, J. Am. Chem. Soc.
85:2149-2146, 1963. Equipment for automated synthesis of polypeptides is
commercially available from suppliers such as Applied BioSystems, Inc., Foster City,
CA, and may be operated according to the manufacturer's instructions. Variants of a
native antigen may generally be prepared using standard mutagenesis techniques, such
as oligonucleotide-directed site-specific mutagenesis. Sections of the DNA sequence
may also be removed using standard techniques to permit preparation of truncated
polypeptides.

Recombinant polypeptides containing portions and/or variants of a
native antigen may be readily prepared from a DNA sequence encoding the polypeptide
using a variety of techniques well known to those of ordinary skill in the art. For
example, supernatants from suitable host/vector systems which secrete recombinant
protein into culture media may be first concentrated using a commercially available
filter. Following concentration, the concentrate may be applied to a suitable
purification matrix such as an affinity matrix or an ion exchange resin. Finally, one or
more reverse phase HPLC steps can be employed to further purify a recombinant
protein.

Any of a variety of expression vectors known to those of ordinary skill in
the art may be employed to express recombinant polypeptides of this invention.
Expression may be achieved in any appropriate host cell that has been transformed or
transfected with an expression vector containing a DNA molecule that encodes a
recombinant polypeptide. Suitable host cells include prokaryotes, yeast and higher
eukaryotic cells. Preferably, the host cells employed are E. coli, yeast or a mammalian
cell line such as COS or CHO. The DNA sequences expressed in this manner may
encode naturally occurring antigens, portions of naturally occurring antigens, or other
variants thereof.

In general, regardless of the method of preparation, the polypeptides
disclosed herein are prepared in substantially pure form. Preferably, the polypeptides
are at least about 80% pure, more preferably at least about 90% pure and most
preferably at least about 99% pure. In certain preferred embodiments, described in
detail below, the substantially pure polypeptides are incorporated into pharmaceutical
compositions or vaccines for use in one or more of the methods disclosed herein.

In certain specific embodiments, the subject invention discloses
polypeptides comprising at least an immunogenic portion of a soluble M. tuberculosis
antigen having one of the following N-terminal sequences, or a variant thereof that
differs only in conservative substitutions and/or modifications:

(a) Asp-Pro-Val-Asp-[...]-Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 120)

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
Ser; (SEQ ID No. 121)

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 122)

(d) Tyr-Tyr-Trp-Cys-[...]-Pro; (SEQ ID No. 123)

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val;
(SEQ ID No. 124)

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID
No. 125)

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ser-
Pro-Pro-Ser; (SEQ ID No. 126)

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
Gly; (SEQ ID No. 127)

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-
Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-
Ala-Asn; (SEQ ID No. 128)

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
Ser; (SEQ ID No. 134)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
Asp; (SEQ ID No. 135) or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
Gly; (SEQ ID No. 136)
wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence
encoding the antigen identified as (g) above is provided in SEQ ID No. 52, and the
polypeptide encoded by SEQ ID No. 52 is provided in SEQ ID No. 53. A DNA
sequence encoding the antigen defined as (a) above is provided in SEQ ID No. 101; its
deduced amino acid sequence is provided in SEQ ID No. 102. A DNA sequence
corresponding to antigen (d) above is provided in SEQ ID No. 24, a DNA sequence
corresponding to antigen (c) is provided in SEQ ID No. 25 and a DNA sequence
corresponding to antigen (i) is provided in SEQ ID No. 99; its deduced amino acid
sequence is provided in SEQ ID No. 100.

In a further specific embodiment, the subject invention discloses
polypeptides comprising at least an immunogenic portion of an M. tuberculosis antigen
having one of the following N-terminal sequences, or a variant thereof that differs only
in conservative substitutions and/or modifications:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137) or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-
Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 129)
wherein Xaa may be any amino acid, preferably a cysteine residue.

In other specific embodiments, the subject invention discloses
polypeptides comprising at least an immunogenic portion of a soluble M. tuberculosis
antigen (or a variant of such an antigen) that comprises one or more of the amino acid
sequences encoded by (a) the DNA sequences of SEQ ID Nos.: 1, 2, 4-10, 13-25 and
52; (b) the complements of such DNA sequences, or (c) DNA sequences substantially
homologous to a sequence in (a) or (b).

In further specific embodiments, the subject invention discloses
polypeptides comprising at least an immunogenic portion of a M. tuberculosis antigen
(or a variant of such an antigen), which may or may not be soluble, that comprises one
or more of the amino acid sequences encoded by (a) the DNA sequences of SEQ ID
Nos.: 26-51, 138, 139, 163-183, 189-193, 199, 200, 201, 203, 215-225, 239, 240, 242-
247, 253-256, 261-276, 292, 293, 295-298 and 303-342, (b) the complements of such
DNA sequences or (c) DNA sequences substantially homologous to a sequence in (a) or
(b).

In the specific embodiments discussed above, the M. tuberculosis
antigens include variants that are encoded by DNA sequences which are substantially
homologous to one or more of DNA sequences specifically recited herein. "Substantial
homology," as used herein, refers to DNA sequences that are capable of hybridizing
under moderately stringent conditions. Suitable moderately stringent conditions include
prewashing in a solution of 5X SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing
at 50°C-65°C, 5X SSC, overnight or, in the case of cross-species homology, at 45°C,
0.5X SSC; followed by washing twice at 65°C for 20 minutes with each of 2X, 0.5X
and 0.2X SSC containing 0.1% SDS. Such hybridizing DNA sequences are also
within the scope of this invention, as are nucleotide sequences that, due to code
degeneracy, encode an immunogenic polypeptide that is encoded by a hybridizing DNA
sequence.

In a related aspect, the present invention provides fusion proteins
comprising a first and a second inventive polypeptide or, alternatively, a polypeptide of
the present invention and a known M. tuberculosis antigen, such as the 38 kD antigen
described in Andersen and Hansen, Infect. Immun. 57:2481-2488, 1989 (Genbank
Accession No. M30046) or ESAT-6 (SEQ ID Nos. 103 and 104), together with variants
of such fusion proteins. The fusion proteins of the present invention may also include a
linker peptide between the first and second polypeptides.

A DNA sequence encoding a fusion protein of the present invention is
constructed using known recombinant DNA techniques to assemble separate DNA
sequences encoding the first and second polypeptides into an appropriate expression
vector. The 3' end of a DNA sequence encoding the first polypeptide is ligated, with or
without a peptide linker, to the 5' end of a DNA sequence encoding the second
polypeptide so that the reading frames of the sequences are in phase to permit mRNA
translation of the two DNA sequences into a single fusion protein that retains the
biological activity of both the first and the second polypeptides.
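
The in-phase requirement described above can be illustrated with a short Python sketch that joins two coding sequences, with an optional linker, while keeping the second sequence in the reading frame of the first. The function name is hypothetical, and the removal of a terminal stop codon from the first sequence is an assumption of this sketch (it is what allows translation to read through into the second polypeptide); actual constructs would be assembled by the recombinant techniques referred to in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_fusion_orf(first_cds: str, second_cds: str, linker: str = "") -> str:
    """Join two coding sequences 5'->3' so the second stays in frame."""
    first, second, link = (s.upper() for s in (first_cds, second_cds, linker))
    for name, seq in (("first", first), ("second", second), ("linker", link)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} sequence is not a whole number of codons")
    # Drop a terminal stop codon on the first sequence so that translation
    # continues into the linker and the second polypeptide.
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    # A stop codon inside the linker would truncate the fusion protein.
    if any(link[i:i + 3] in STOP_CODONS for i in range(0, len(link), 3)):
        raise ValueError("linker contains an in-frame stop codon")
    return first + link + second
```

The sketch checks frame and stop codons only; it does not verify that either coding sequence begins with a start codon or is free of internal stops.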

A peptide linker sequence may be employed to separate the first and the
second polypeptides by a distance sufficient to ensure that each polypeptide folds into
its secondary and tertiary structures. Such a peptide linker sequence is incorporated into
the fusion protein using standard techniques well known in the art. Suitable peptide
linker sequences may be chosen based on the following factors: (1) their ability to
adopt a flexible extended conformation; (2) their inability to adopt a secondary structure
that could interact with functional epitopes on the first and second polypeptides; and
(3) the lack of hydrophobic or charged residues that might react with the polypeptide
functional epitopes. Preferred peptide linker sequences contain Gly, Asn and Ser
residues. Other near neutral amino acids, such as Thr and Ala, may also be used in the
linker sequence. Amino acid sequences which may be usefully employed as linkers
include those disclosed in Maratea et al., Gene 40:[...], 1985; Murphy et al., Proc.
Natl. Acad. Sci. USA 83:8258-8262, 1986; U.S. Patent No. [...] and U.S. Patent
No. 4,751,180. The linker sequence may be from 1 to about 50 amino acids in length.
Peptide sequences are not required when the first and second polypeptides have non-
essential N-terminal amino acid regions that can be used to separate the functional
domains and prevent steric interference.
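
A candidate linker may be screened against the composition and length guidance above with a sketch such as the following. The function name is hypothetical; the allowed residue set (Gly, Asn, Ser, plus the near neutral Thr and Ala) is an assumption based on a partly illegible passage, and the check does not attempt to predict conformation or secondary structure.

```python
# One-letter codes: G = Gly, N = Asn, S = Ser (preferred, assumed reading);
# T = Thr, A = Ala (near neutral residues named in the text).
ALLOWED_LINKER_RESIDUES = set("GNS") | set("TA")


def linker_issues(linker: str) -> list[str]:
    """Return a list of reasons a linker falls outside the stated guidance."""
    seq = linker.upper()
    issues = []
    if not 1 <= len(seq) <= 50:
        issues.append(f"length {len(seq)} is outside 1 to about 50 residues")
    unexpected = sorted(set(seq) - ALLOWED_LINKER_RESIDUES)
    if unexpected:
        issues.append("residues outside Gly/Asn/Ser/Thr/Ala: " + ", ".join(unexpected))
    return issues
```

An empty list indicates that the linker passes both checks; charged or hydrophobic residues such as Lys or Leu are reported by name.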

The ligated DNA sequences are operably linked to suitable
transcriptional or translational regulatory elements. The regulatory elements
responsible for expression of DNA are located only 5' to the DNA sequence encoding
the first polypeptides. Similarly, stop codons required to end translation and
transcription termination signals are only present 3' to the DNA sequence encoding the
second polypeptide.

In another aspect, the present invention provides methods for using one
or more of the above polypeptides or fusion proteins (or DNA molecules encoding such
polypeptides) to induce protective immunity against tuberculosis in a patient. As used
herein, a "patient" refers to any warm-blooded animal, preferably a human. A patient
may be afflicted with a disease, or may be free of detectable disease and/or infection. In
other words, protective immunity may be induced to prevent or treat tuberculosis.

In this aspect, the polypeptide, fusion protein or DNA molecule is
generally present within a pharmaceutical composition and/or a vaccine.
Pharmaceutical compositions may comprise one or more polypeptides, each of which
may contain one or more of the above sequences (or variants thereof), and a
physiologically acceptable carrier. Vaccines may comprise one or more of the above
polypeptides and a non-specific immune response enhancer, such as an adjuvant or a
liposome (into which the polypeptide is incorporated). Such pharmaceutical
compositions and vaccines may also contain other M. tuberculosis antigens, either
incorporated into a combination polypeptide or present within a separate polypeptide.

Alternatively, a vaccine may contain DNA encoding one or more
polypeptides as described above, such that the polypeptide is generated in situ. In such
vaccines, the DNA may be present within any of a variety of delivery systems known to
those of ordinary skill in the art, including nucleic acid expression systems, bacterial
and viral expression systems. Appropriate nucleic acid expression systems contain the
necessary DNA sequences for expression in the patient (such as a suitable promoter and
terminating signal). Bacterial delivery systems involve the administration of a
bacterium (such as Bacillus Calmette-Guérin) that expresses an immunogenic portion
of the polypeptide on its cell surface. In a preferred embodiment, the DNA may be
introduced using a viral expression system (e.g., vaccinia or other pox virus, retrovirus,
or adenovirus), which may involve the use of a non-pathogenic (defective), replication
competent virus. Techniques for incorporating DNA into such expression systems are
well known to those of ordinary skill in the art. The DNA may also be "naked," as
described, for example, in Ulmer et al., Science 259:1745-1749, 1993 and reviewed by
Cohen, Science 259:1691-1692, 1993. The uptake of naked DNA may be increased by
coating the DNA onto biodegradable beads, which are efficiently transported into the
cells.



In a related aspect, a DNA vaccine as described above may be
administered simultaneously with or sequentially to either a polypeptide of the present
invention or a known M. tuberculosis antigen, such as the 38 kD antigen described
above. For example, administration of DNA encoding a polypeptide of the present
invention, either "naked" or in a delivery system as described above, may be followed
by administration of an antigen in order to enhance the protective immune effect of the
vaccine.



Routes and frequency of administration, as well as dosage, will vary
from individual to individual and may parallel those currently being used in
immunization using BCG. In general, the pharmaceutical compositions and vaccines
may be administered by injection (e.g., intracutaneous, intramuscular, intravenous or
subcutaneous), intranasally (e.g., by aspiration) or orally. Between 1 and 3 doses may
be administered for a 1-36 week period. Preferably, 3 doses are administered, at
intervals of 3-4 months, and booster vaccinations may be given periodically thereafter.
Alternate protocols may be appropriate for individual patients. A suitable dose is an
amount of polypeptide or DNA that, when administered as described above, is capable
of raising an immune response in an immunized patient sufficient to protect the patient
from M. tuberculosis infection for at least 1-2 years. In general, the amount of
polypeptide present in a dose (or produced in situ by the DNA in a dose) ranges from
about 1 pg to about 100 mg per kg of host, typically from about 10 pg to about 1 mg,
and preferably from about 100 pg to about 1 µg. Suitable dose sizes will vary with the
size of the patient, but will typically range from about 0.1 mL to about 5 mL.
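As a rough illustration of how the nested dose ranges above relate to one another, the following minimal sketch classifies a per-kg polypeptide amount against the three quoted ranges. It assumes the "per kg of host" basis applies to all three ranges, which the text does not state explicitly; the function and table names are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only: classify a per-kg polypeptide amount against the
# ranges quoted in the text. Assumption: the "per kg of host" basis applies to
# all three ranges. All names here are hypothetical.

PG_PER_UNIT = {"pg": 1.0, "ng": 1e3, "ug": 1e6, "mg": 1e9}

# (label, lower bound in pg/kg, upper bound in pg/kg), narrowest range first
DOSE_RANGES = [
    ("preferred", 100.0, 1e6),  # about 100 pg to about 1 ug
    ("typical", 10.0, 1e9),     # about 10 pg to about 1 mg
    ("general", 1.0, 1e11),     # about 1 pg to about 100 mg
]


def dose_range_label(amount, unit):
    """Return the narrowest quoted range that contains the per-kg amount."""
    pg_per_kg = amount * PG_PER_UNIT[unit]
    for label, low, high in DOSE_RANGES:
        if low <= pg_per_kg <= high:
            return label
    return "outside quoted ranges"


def total_amount_pg(amount, unit, host_mass_kg):
    """Scale a per-kg amount to a host of the given mass, in picograms."""
    return amount * PG_PER_UNIT[unit] * host_mass_kg
```

For example, `dose_range_label(500, "pg")` falls in the preferred range, while `dose_range_label(10, "mg")` falls only in the general range.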

While any suitable carrier known to those of ordinary skill in the art may
be employed in the pharmaceutical compositions of this invention, the type of carrier
will vary depending on the mode of administration. For parenteral administration, such
as subcutaneous injection, the carrier preferably comprises water, saline, alcohol, a fat, a
wax or a buffer. For oral administration, any of the above carriers or a solid carrier,
such as mannitol, lactose, starch, magnesium stearate, sodium saccharine, talcum,
cellulose, glucose, sucrose, and magnesium carbonate, may be employed.
Biodegradable microspheres (e.g., polylactic galactide) may also be employed as
carriers for the pharmaceutical compositions of this invention. Suitable biodegradable
microspheres are disclosed, for example, in U.S. Patent Nos. 4,897,268 and 5,075,109.

Any of a variety of adjuvants may be employed in the vaccines of this
invention to nonspecifically enhance the immune response. Most adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum
hydroxide or mineral oil, and a nonspecific stimulator of immune responses, such as
lipid A, Bordetella pertussis or Mycobacterium tuberculosis. Suitable adjuvants are
commercially available as, for example, Freund's Incomplete Adjuvant and Freund's 
Complete Adjuvant (Difco Laboratories) and Merck Adjuvant 65 (Merck and 
Company, Inc., Rahway, NJ). Other suitable adjuvants include alum, biodegradable 
microspheres, monophosphoryl lipid A and quil A. 

In another aspect, this invention provides methods for using one or more 
of the polypeptides described above to diagnose tuberculosis using a skin test. As used 
herein, a "skin test" is any assay performed directly on a patient in which a delayed-type 
hypersensitivity (DTH) reaction (such as swelling, reddening or dermatitis) is measured 
following intradermal injection of one or more polypeptides as described above. Such 
injection may be achieved using any suitable device sufficient to contact the 
polypeptide or polypeptides with dermal cells of the patient, such as a tuberculin 
syringe or 1 mL syringe. Preferably, the reaction is measured at least 48 hours after 
injection, more preferably 48-72 hours. 

The DTH reaction is a cell-mediated immune response, which is greater 
in patients that have been exposed previously to the test antigen (i.e., the immunogenic
portion of the polypeptide employed, or a variant thereof). The response may be
measured visually, using a ruler. In general, a response that is greater than about 0.5 cm 
in diameter, preferably greater than about 1.0 cm in diameter, is a positive response, 
indicative of tuberculosis infection, which may or may not be manifested as an active 
disease. 
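The reading rules for the skin test described above can be summarized in a short sketch. The cutoffs (0.5 cm, preferably 1.0 cm) and the reading window (at least 48 hours, more preferably 48-72 hours) are taken from the text; the function names and the `strict` flag are hypothetical conveniences, and this is not a clinical tool.

```python
# Illustrative sketch of the skin-test reading rules stated in the text.
# Function names and the `strict` flag are hypothetical.

def dth_positive(diameter_cm, strict=False):
    """A response greater than about 0.5 cm (preferably greater than about
    1.0 cm) in diameter is treated as positive."""
    cutoff_cm = 1.0 if strict else 0.5
    return diameter_cm > cutoff_cm


def reading_time_acceptable(hours_after_injection):
    """The reaction is preferably measured at least 48 hours after injection."""
    return hours_after_injection >= 48


def reading_time_preferred(hours_after_injection):
    """More preferably, the reaction is measured 48-72 hours after injection."""
    return 48 <= hours_after_injection <= 72
```

A 0.8 cm response read at 60 hours would count as positive under the general cutoff but not under the stricter 1.0 cm cutoff.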

The polypeptides of this invention are preferably formulated, for use in a 
skin test, as pharmaceutical compositions containing a polypeptide and a 
physiologically acceptable carrier, as described above. Such compositions typically
contain one or more of the above polypeptides in an amount ranging from about 1 µg to
about 100 µg, preferably from about 10 µg to about 50 µg in a volume of 0.1 mL.
Preferably, the carrier employed in such pharmaceutical compositions is a saline
solution with appropriate preservatives, such as phenol and/or Tween 80™. 

In a preferred embodiment, a polypeptide employed in a skin test is of 
sufficient size such that it remains at the site of injection for the duration of the reaction 
period. In general, a polypeptide that is at least 9 amino acids in length is sufficient. 
The polypeptide is also preferably broken down by macrophages within hours of 
injection to allow presentation to T-cells. Such polypeptides may contain repeats of one 
or more of the above sequences and/or other immunogenic or nonimmunogenic 
sequences. 



The following Examples are offered by way of illustration and not by 
way of limitation. 






EXAMPLES
EXAMPLE 1

PURIFICATION AND CHARACTERIZATION OF POLYPEPTIDES
FROM M. TUBERCULOSIS CULTURE FILTRATE

This example illustrates the preparation of M. tuberculosis soluble 
polypeptides from culture filtrate. Unless otherwise noted, all percentages in the
following example are weight per volume.

M. tuberculosis (either H37Ra, ATCC No. 25177, or H37Rv, ATCC
No. 25618) was cultured in sterile GAS media at 37°C for fourteen days. The media
was then vacuum filtered (leaving the bulk of the cells) through a 0.45 µ filter into a
sterile 2.5 L bottle. The media was next filtered through a 0.2 µ filter into a sterile 4 L
bottle and NaN3 was added to the culture filtrate to a concentration of 0.04%. The
bottles were then placed in a 4°C cold room. 

The culture filtrate was concentrated by placing the filtrate in a 12 L
reservoir that had been autoclaved and feeding the filtrate into a 400 ml Amicon stir cell
which had been rinsed with ethanol and contained a 10,000 kDa MWCO membrane.
The pressure was maintained at 60 psi using nitrogen gas. This procedure reduced the
12 L volume to approximately 50 ml.

The culture filtrate was dialyzed into 0.1% ammonium bicarbonate using
a 8,000 kDa MWCO cellulose ester membrane, with two changes of ammonium
bicarbonate solution. Protein concentration was then determined by a commercially
available BCA assay (Pierce, Rockford, IL).

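The dialyzed culture filtrate was then lyophilized, and the polypeptides
resuspended in distilled water. The polypeptides were dialyzed against 0.01 mM 1,3
bis[tris(hydroxymethyl)-methylamino]propane, pH 7.5 (Bis-Tris propane buffer), the
initial conditions for anion exchange chromatography. Fractionation was performed
using gel profusion chromatography on a POROS 146 II Q/M anion exchange column
4.6 mm x 100 mm (Perseptive BioSystems, Framingham, MA) equilibrated in 0.01 mM
Bis-Tris propane buffer pH 7.5. Polypeptides were eluted with a linear 0-0.5 M NaCl
gradient in the above buffer system. The column eluent was monitored at a wavelength
of 220 nm.

The pools of polypeptides eluting from the ion exchange column were
dialyzed against distilled water and lyophilized. The resulting material was dissolved in
0.1% trifluoroacetic acid (TFA) pH 1.9 in water, and the polypeptides were purified on
a Delta-Pak C18 column (Waters, Milford, MA) 300 Angstrom pore size, 5 micron
particle size (3.9 x 150 mm). The polypeptides were eluted from the column with a
linear gradient from 0-60% dilution buffer (0.1% TFA in acetonitrile). The flow rate
was 0.75 ml/minute and the HPLC eluent was monitored at 214 nm. Fractions
containing the eluted polypeptides were collected to maximize the purity of the
individual samples. Approximately 200 purified polypeptides were obtained.

The purified polypeptides were then screened for the ability to induce T-
cell proliferation in PBMC preparations. The PBMCs from donors known to be PPD
skin test positive and whose T-cells were shown to proliferate in response to PPD and
crude soluble proteins from MTB were cultured in medium comprising RPMI 1640
supplemented with 10% pooled human serum and 50 µg/ml gentamicin. Purified
polypeptides were added in duplicate at concentrations of 0.5 to 10 µg/mL. After six
days of culture in 96-well round-bottom plates in a volume of 200 µl, 50 µl of medium
was removed from each well for determination of IFN-γ levels, as described below.
The plates were then pulsed with 1 µCi/well of tritiated thymidine for a further 18
hours, harvested and tritium uptake determined using a gas scintillation counter.
Fractions that resulted in proliferation in both replicates three fold greater than the
proliferation observed in cells cultured in medium alone were considered positive.

IFN-γ was measured using an enzyme-linked immunosorbent assay
(ELISA). ELISA plates were coated with a mouse monoclonal antibody directed to
human IFN-γ (Chemicon) in PBS for four hours at room temperature. Wells were then
blocked with PBS containing 5% (W/V) non-fat dried milk for 1 hour at room
temperature. The plates were then washed six times in PBS/0.2% TWEEN-20 and
samples diluted 1:2 in culture medium in the ELISA plates were incubated overnight at
room temperature. The plates were again washed and a polyclonal rabbit anti-human
IFN-γ serum diluted 1:3000 in PBS/10% normal goat serum was added to each well.
The plates were then incubated for two hours at room temperature, washed and
horseradish peroxidase-coupled anti-rabbit IgG (Jackson Labs.) was added at a 1:2000
dilution in PBS/5% non-fat dried milk. After a further two hour incubation at room
temperature, the plates were washed and TMB substrate added. The reaction was
stopped after 20 min with 1 N sulfuric acid. Optical density was determined at 450 nm
using 570 nm as a reference wavelength. Fractions that resulted in both replicates
giving an OD two fold greater than the mean OD from cells cultured in medium alone,
plus 3 standard deviations, were considered positive.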
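The two screening criteria used in this example (proliferation in both replicates at least three fold above medium alone, and an ELISA OD in both replicates above twice the mean medium-alone OD plus 3 standard deviations) can be sketched as follows. This is a minimal illustration, not the inventors' analysis code: the function names are hypothetical, and the reading of the ELISA cutoff as `2 * mean + 3 * SD` is an assumption about how the sentence parses.

```python
# Illustrative sketch of the positivity criteria described in this example.
# Assumption: the ELISA cutoff is read as 2 x mean(medium OD) + 3 x SD(medium OD).
# All names are hypothetical.
from statistics import mean, stdev


def proliferation_positive(replicate_cpm, medium_cpm, fold=3.0):
    """Positive if every replicate exceeds `fold` times the mean counts
    observed for cells cultured in medium alone."""
    baseline = mean(medium_cpm)
    return all(cpm > fold * baseline for cpm in replicate_cpm)


def elisa_positive(replicate_od, medium_od, fold=2.0, n_sd=3.0):
    """Positive if every replicate OD exceeds fold x mean + n_sd x SD of the
    medium-alone wells (at least two medium-alone wells are required)."""
    cutoff = fold * mean(medium_od) + n_sd * stdev(medium_od)
    return all(od > cutoff for od in replicate_od)
```

Requiring both replicates to pass, rather than their average, mirrors the "in both replicates" wording and guards against a single outlying well.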

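For sequencing, the polypeptides were individually dried onto
Biobrene™ (Perkin Elmer/Applied BioSystems Division, Foster City, CA) treated glass
fiber filters. The filters with polypeptide were loaded onto a Perkin Elmer/Applied
BioSystems Division Procise 492 protein sequencer. The polypeptides were sequenced
from the amino terminal and using traditional Edman chemistry. The amino acid
sequence was determined for each polypeptide by comparing the retention time of the
PTH amino acid derivative to the appropriate PTH derivative standards.

Using the procedure described above, antigens having the following
N-terminal sequences were isolated: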

(a) ^^V.l-A.p.Ala.Vd.ne.A^TT.r.Th.X^A^T^ly. 
Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 54)

(b) Al^VM^,^!^^^^^ 

Ser; (SEQ ID No. 55)
Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 56) 

Pro; (SEQ ID No. 57) 
(e) W^t,^^^ 
(SEQ ID No. 58) 



(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID
No. 59)

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala-
Pro-Pro-Ala; (SEQ ID No. 60) and

(« ^-^ys-Thr.T^^u.Gto.Lcu-Ly^ly.Tlr.A^- 
Gly; (SEQ ID No. 61)
wherein Xaa may be any amino acid. 

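An additional antigen was isolated employing a microbore HPLC
purification step in addition to the procedure described above. Specifically, 20 µl of a
fraction comprising a mixture of antigens from the chromatographic purification step
previously described, was purified on an Aquapore C18 column (Perkin Elmer/Applied
Biosystems Division, Foster City, CA) with a 7 micron pore size, column size 1 mm x
100 mm, in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions
were eluted from the column with a linear gradient of 1%/minute of acetonitrile
(containing 0.05% TFA) in water (0.05% TFA) at a flow rate of 80 µl/minute. The
eluent was monitored at 250 nm. The original fraction was separated into 4 major peaks
plus other smaller components and a polypeptide was obtained which was shown to
have a molecular weight of 12.054 Kd (by mass spectrometry) and the following N-
terminal sequence: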

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-
Thr-Ser-Leu-Leu-Asn-Asn-Leu-Ala-Asp-Pro-Asp-Val-Ser-Phe-
Ala-Asp (SEQ ID No. 62).

This polypeptide was shown to induce proliferation and IFN-γ production in PBMC
preparations using the assays described above.

Additional soluble antigens were isolated from M. tuberculosis culture
filtrate as follows. M. tuberculosis culture filtrate was prepared as described above.
Following dialysis against Bis-Tris propane buffer, at pH 5.5, fractionation was
performed using anion exchange chromatography on a Poros QE column 4.6 x 100 mm
(Perseptive Biosystems) equilibrated in Bis-Tris propane buffer pH 5.5. Polypeptides
were eluted with a linear 0-1.5 M NaCl gradient in the above buffer system at a flow
rate of 10 ml/min. The column eluent was monitored at a wavelength of 214 nm.

The fractions eluting from the ion exchange column were pooled and
subjected to reverse phase chromatography using a Poros R2 column 4.6 x 100 mm
(Perseptive Biosystems). Polypeptides were eluted from the column with a linear
gradient from 0-100% acetonitrile (0.1% TFA) at a flow rate of 5 ml/min. The eluent
was monitored at 214 nm.

Fractions containing the eluted polypeptides were lyophilized and
resuspended in 80 µl of aqueous 0.1% TFA and further subjected to reverse phase
chromatography on a Vydac C4 column 4.6 x 150 mm (Western Analytical, Temecula,
CA) with a linear gradient of 0-100% acetonitrile (0.1% TFA) at a flow rate of 2
ml/min. Eluent was monitored at 214 nm.

The fraction with biological activity was separated into one major peak
plus other smaller components. Western blot of this peak onto PVDF membrane
revealed three major bands of molecular weights 14 Kd, 20 Kd and 26 Kd. These
polypeptides were determined to have the following N-terminal sequences, respectively:

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
Ser; (SEQ ID No. 134)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
Asp; (SEQ ID No. 135) and

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
Gly; (SEQ ID No. 136), wherein Xaa may be any amino acid.

Using the assays described above, these polypeptides were shown to induce
proliferation and IFN-γ production in PBMC preparations. Figs. 1A and B show the
results of such assays using PBMC preparations from a first and a second donor,
respectively.

DNA sequences that encode the antigens designated as (a), (c), (d) and 
(g) above were obtained by screening a genomic M. tuberculosis library using 32P end
labeled degenerate oligonucleotides corresponding to the N-terminal sequence and
containing M. tuberculosis codon bias. The screen performed using a probe
corresponding to antigen (a) above identified a clone having the sequence provided in
SEQ ID No. 101. The polypeptide encoded by SEQ ID No. 101 is provided in SEQ ID
No. 102. The screen performed using a probe corresponding to antigen (g) above
identified a clone having the sequence provided in SEQ ID No. 52. The polypeptide
encoded by SEQ ID No. 52 is provided in SEQ ID No. 53. The screen performed
using a probe corresponding to antigen (d) above identified a clone having the sequence
provided in SEQ ID No. 24, and the screen performed with a probe corresponding to
antigen (c) identified a clone having the sequence provided in SEQ ID No. 25.
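To give a sense of why degenerate oligonucleotides are needed when probing from an N-terminal peptide sequence, the sketch below counts how many distinct DNA sequences could encode a stretch of residues under the standard genetic code, and finds the least degenerate window of a given width. It is an illustration only: it does not model the M. tuberculosis codon bias mentioned above (which would reduce the effective degeneracy), it treats an unknown residue (Xaa) as a fully degenerate NNN codon, and all names are hypothetical.

```python
# Illustrative sketch: degeneracy of back-translated peptides under the
# standard genetic code. Codon bias is NOT modelled. Names are hypothetical.
from math import prod

# Number of codons encoding each residue in the standard genetic code.
CODONS_PER_RESIDUE = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2, "Gln": 2, "Glu": 2,
    "Gly": 4, "His": 2, "Ile": 3, "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2,
    "Pro": 4, "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
    "Xaa": 64,  # assumption: unknown residue treated as a fully degenerate NNN
}


def oligo_degeneracy(residues):
    """Number of distinct coding sequences for a list of three-letter residues."""
    return prod(CODONS_PER_RESIDUE[r] for r in residues)


def least_degenerate_window(residues, width):
    """Return (start_index, degeneracy) of the least degenerate window."""
    best = None
    for start in range(len(residues) - width + 1):
        d = oligo_degeneracy(residues[start:start + width])
        if best is None or d < best[1]:
            best = (start, d)
    return best
```

For instance, the six residues Asp-Pro-Glu-Pro-Ala-Pro can be encoded by 2 x 4 x 2 x 4 x 4 x 4 = 1024 different 18-mers, which is why a probe is usually designed against the least degenerate stretch and narrowed further with codon-usage information.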

The above amino acid sequences were compared to known amino acid 
sequences in the gene bank using the DNA STAR system. The database searched
contains some 173,000 proteins and is a combination of the Swiss, PIR databases along
with translated protein sequences (Version 87). No significant homologies to the amino
acid sequences for antigens (a)-(h) and (l) were detected.

The amino acid sequence for antigen (i) was found to be homologous to 
a sequence from M. leprae. The full length M. leprae sequence was amplified from 
genomic DNA using the sequence obtained from GENBANK. This sequence was then 
used to screen the M. tuberculosis library described below in Example 2 and a full 
length copy of the M. tuberculosis homologue was obtained (SEQ ID No. 99).

The amino acid sequence for antigen (j) was found to be homologous to
a known M. tuberculosis protein translated from a DNA sequence. To the best of the
inventors' knowledge, this protein has not been previously shown to possess T-cell
stimulatory activity. The amino acid sequence for antigen (k) was found to be related to
a sequence from M. leprae. 

In the proliferation and IFN-y assays described above, using three PPD 
positive donors, the results for representative antigens provided above are presented in 
Table 1: 






TABLE 1 

RESULTS OF PBMC PROLIFERATION AND IFN-γ ASSAYS




In Table 1, responses that gave a stimulation index (SI) of between 2 and
4 (compared to cells cultured in medium alone) were scored as +, an SI of 4-8 or 2-4 at
a concentration of 1 µg or less was scored as ++ and an SI of greater than 8 was scored
as +++. The antigen of sequence (i) was found to have a high SI (+++) for one donor
and lower SI (++ and +) for the two other donors in both proliferation and IFN-γ assays.
These results indicate that these antigens are capable of inducing proliferation and/or
interferon-γ production.

EXAMPLE 2

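This example illustrates the isolation of antigens from M. tuberculosis
lysate by screening with serum from M. tuberculosis-infected individuals.

Desiccated M. tuberculosis H37Ra (Difco Laboratories) was added to a
2% NP40 solution, and alternately homogenized and sonicated three times. The
resulting suspension was centrifuged at 13,000 rpm in microfuge tubes and the
supernatant put through a 0.2 micron syringe filter. The filtrate was bound to Macro
Prep DEAE beads (BioRad, Hercules, CA). The beads were extensively washed with
20 mM Tris pH 7.5 and bound proteins eluted with 1 M NaCl. The 1 M NaCl elute was
dialyzed overnight against 10 mM Tris, pH 7.5. Dialyzed solution was treated with
DNase and RNase at 0.05 mg/ml for 30 min. at room temperature and then with
α-D-mannosidase, 0.5 U/mg at pH 4.5 for 3-4 hours at room temperature. After
returning to pH 7.5, the material was fractionated via FPLC over a Bio Scale-Q-20
column (BioRad). Fractions were combined into nine pools, concentrated in a
Centriprep 10 (Amicon, Beverley, MA) and then screened by Western blot for
serological activity using a serum pool from M. tuberculosis-infected patients which
was not immunoreactive with other antigens of the present invention.

The most reactive fraction was run in SDS-PAGE and transferred to
PVDF. A band at approximately 85 Kd was cut out yielding the sequence:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137), wherein Xaa may
be any amino acid.

Comparison of this sequence with those in the gene bank as described
above, revealed no significant homologies to known sequences.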

obtain , A Se " UenCe *" enCOdK *" ""^ **— d - ab°- was 
b^ned by screernng a g CTOmic M - a4 _ fe;j ^ ^ 

generate oii g o„uc,cou des corresponding ,o ,e N-termin*, sequ.ee of SHQ ID 
*" having the DNA sequence provided in SEQ !D NO 
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2 2 r r~° fomd •* — p^cd * seq © 

Companson oTto. sequences ^ ^ „ tfc genebank ^ ^ 

~ y identic in m 

EXAMPLE 3

This example illustrates the preparation of DNA sequences encoding
M. tuberculosis antigens by screening a M. tuberculosis expression library with sera
obtained from patients infected with M. tuberculosis, or with anti-sera raised against
soluble M. tuberculosis antigens.

Genomic DNA was isolated from the M. tuberculosis strain H37Ra. The
DNA was randomly sheared and used to construct an expression library using the
Lambda ZAP expression system (Stratagene, La Jolla, CA). Rabbit anti-sera was
generated against secretory proteins of the M. tuberculosis strains H37Ra, H37Rv and
Erdman by immunizing a rabbit with concentrated supernatant of the M. tuberculosis
cultures. Specifically, the rabbit was first immunized subcutaneously with 200 µg of
protein antigen in a total volume of 2 ml containing 10 µg muramyl dipeptide
(Calbiochem, La Jolla, CA) and 1 ml of incomplete Freund's adjuvant. Four weeks later
the rabbit was boosted subcutaneously with 100 µg antigen in incomplete Freund's
adjuvant. Finally, the rabbit was immunized intravenously four weeks later with 50 µg
protein antigen. The anti-sera were used to screen the expression library as described in
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques expressing
immunoreactive antigens were purified. Phagemid from the plaques was rescued and
the nucleotide sequences of the M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 25 represent sequences that
have not been previously identified in human M. tuberculosis. Recombinant antigens
were expressed and purified antigens used in the immunological analysis described in
Example 1. Proteins were induced by IPTG and purified by gel elution, as described in
Skeiky et al., J. Exp. Med. 181:1527-1537, 1995. Representative sequences of DNA
molecules identified in this screen are provided in SEQ ID Nos.: 1-25. The
corresponding predicted amino acid sequences are shown in SEQ ID Nos. 63-87.

On comparison of these sequences with known sequences in the gene
bank using the databases described above, it was found that the clones referred to
hereinafter as TbRA2A, TbRA16, TbRA18, and TbRA29 (SEQ ID Nos. 76, 68, 70, 75)
show some homology to sequences previously identified in Mycobacterium leprae but
not in M. tuberculosis. TbRA2A was found to be a lipoprotein, with a six residue
lipidation sequence being located adjacent to a hydrophobic secretory sequence.
TbRA11, TbRA26, TbRA28 and TbDPEP (SEQ ID Nos.: 65, 73, 74, 53) have been
previously identified in M. tuberculosis. No significant homologies were found to
TbRA1, TbRA3, TbRA4, TbRA9, TbRA10, TbRA13, TbRA17, TbRa19, TbRA29,
TbRA32, TbRA36 and the overlapping clones TbRA35 and TbRA12 (SEQ ID Nos. 63,
77, 81, 82, 64, 67, 69, 71, 75, 78, 80, 79, 66). The clone TbRa24 is overlapping with
clone TbRa29.

The results of PBMC proliferation and interferon-γ assays performed on
representative recombinant antigens, and using T-cell preparations from several
different M. tuberculosis-immune patients, are presented in Tables 2 and 3,
respectively.

In Tables 2 and 3, responses that gave a stimulation index (SI) of
between 1.2 and 2 (compared to cells cultured in medium alone) were scored as ±, a SI
of 2-4 was scored as +, an SI of 4-8 or 2-4 at a concentration of 1 µg or less was scored
as ++ and an SI of greater than 8 was scored as +++. In addition, the effect of
concentration on proliferation and interferon-γ production is shown for two of the above
antigens in the attached Figure. For both proliferation and interferon-γ production,
TbRa3 was scored as ++ and TbRa9 as +.

These results indicate that these soluble antigens can induce proliferation 
and/or interferon-γ production in T-cells derived from an M. tuberculosis-immune
individual.
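The stimulation index scoring used for the tables in these examples can be expressed compactly. The sketch below follows the bands stated in the text (a borderline band of 1.2-2 scored ±, 2-4 scored +, 4-8 or 2-4 at 1 µg or less scored ++, greater than 8 scored +++). How exact boundary values such as SI = 4 are assigned, the use of "-" for anything below the lowest band, and the definition of SI as a simple ratio of counts are assumptions; the function names are hypothetical.

```python
# Illustrative sketch of the stimulation index (SI) scoring bands.
# Assumptions: SI is the ratio of antigen-stimulated to medium-alone counts;
# boundary values go to the higher band; "-" denotes a sub-threshold response.

def stimulation_index(antigen_cpm, medium_cpm):
    """Ratio of the response with antigen to the response in medium alone."""
    return antigen_cpm / medium_cpm


def score_si(si, conc_ug=None, borderline_floor=1.2):
    """Map an SI (and optionally the antigen concentration in ug) to a score.

    Pass borderline_floor=None to drop the +/- band, matching the simpler
    scheme that starts at an SI of 2.
    """
    if si > 8:
        return "+++"
    if si >= 4:
        return "++"
    if si >= 2:
        # An SI of 2-4 is upgraded when reached at 1 ug or less of antigen.
        if conc_ug is not None and conc_ug <= 1:
            return "++"
        return "+"
    if borderline_floor is not None and si >= borderline_floor:
        return "±"
    return "-"
```

Under these assumptions an SI of 3 measured at 0.5 µg scores ++, whereas the same SI at 10 µg scores +.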



TO IDENTIFY DNA SEQUENCES ENCODING M. TUBERCULOSIS ANTIGENS

The genomic DNA library described above, and an additional H37Rv 
library, were screened using pools of sera obtained from patients with active 
tuberculosis. To prepare the H37Rv library, M. tuberculosis strain H37Rv genomic 
DNA was isolated, subjected to partial Sau3A digestion and used to construct an 
expression library using the Lambda Zap expression system (Stratagene, La Jolla, CA).
Three different pools of sera, each containing sera obtained from three individuals with
active pulmonary or pleural disease, were used in the expression screening. The pools
were designated TbL, TbM and TbH, referring to relative reactivity with H37Ra lysate
(i.e., TbL = low reactivity, TbM = medium reactivity and TbH = high reactivity) in both
ELISA and immunoblot format. A fourth pool of sera from seven patients with active
pulmonary tuberculosis was also employed. All of the sera lacked increased reactivity 
with the recombinant 38 kD M. tuberculosis H37Ra phosphate-binding protein. 

All pools were pre-adsorbed with E. coli lysate and used to screen the 
H37Ra and H37Rv expression libraries, as described in Sambrook et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor,
NY, 1989. Bacteriophage plaques expressing immunoreactive antigens were purified.
Phagemid from the plaques was rescued and the nucleotide sequences of the
M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 31 represented sequences that
had not been previously identified in human M. tuberculosis. Representative sequences
of the DNA molecules identified are provided in SEQ ID Nos.: 26-51 and 105. Of
these, TbH-8-2 (SEQ. ID NO. 105) is a partial clone of TbH-8, and TbH-4 (SEQ. ID
NO. 43) and TbH-4-FWD (SEQ. ID NO. 44) are non-contiguous sequences from the
same clone. Amino acid sequences for the antigens hereinafter identified as Tb38-1,
TbH-4, TbH-8, TbH-9, and TbH-12 are shown in SEQ ID Nos.: 88-92. Comparison of
these sequences with known sequences in the gene bank using the databases identified
above revealed no significant homologies to TbH-4, TbH-8, TbH-9 and TbM-3,
although weak homologies were found to TbH-9. TbH-12 was found to be homologous
to a 34 kD antigenic protein previously identified in M. paratuberculosis (Acc.
No. S28515). Tb38-1 was found to be located 34 base pairs upstream of the open
reading frame for the antigen ESAT-6 previously identified in M. bovis (Acc.
No. U34848) and in M. tuberculosis (Sorensen et al., Infec. Immun. 63:1710-1717,
1995).

Probes derived from Tb38-1 and TbH-9, both isolated from an H37Ra
library, were used to identify clones in an H37Rv library. Tb38-1 hybridized to
Tb38-1F2, Tb38-1F3, Tb38-1F5 and Tb38-1F6 (SEQ. ID NOS. 112, 113, 116, 118 and
119). (SEQ ID NOS. 112 and 113 are non-contiguous sequences from clone
Tb38-1F2.) Two open reading frames were deduced in Tb38-1F2; one corresponds to
Tb37FL (SEQ. ID. NO. 114), the second, a partial sequence, may be the homologue of
Tb38-1 and is called Tb38-IN (SEQ. ID NO. 115). The deduced amino acid sequence of
Tb38-1F3 is presented in SEQ. ID. NO. 117. A TbH-9 probe identified three clones in
the H37Rv library: TbH-9-FL (SEQ. ID NO. 106), which may be the homologue of
TbH-9 (H37Ra), TbH-9-1 (SEQ. ID NO. 108), and TbH-9-4 (SEQ. ID NO. 110), all of
which are highly related sequences to TbH-9. The deduced amino acid sequences for
these three clones are presented in SEQ ID NOS. 107, 109 and 111.

Further screening of the M. tuberculosis genomic DNA library, as
described above, resulted in the recovery of ten additional reactive clones, representing
seven different genes. One of these genes was identified as the 38 Kd antigen discussed
above, one was determined to be identical to the 14 Kd alpha crystallin heat shock
protein previously shown to be present in M. tuberculosis, and a third was determined
to be identical to the antigen TbH-8 described above. The determined DNA sequences
for the remaining five clones (hereinafter referred to as TbH-29, TbH-30, TbH-32 and
TbH-33) are provided in SEQ ID NO: 138-141, respectively, with the corresponding
predicted amino acid sequences being provided in SEQ ID NO: 142-145, respectively.
The DNA and amino acid sequences for these antigens were compared with those in the
gene bank as described above. No homologies were found to the 5' end of TbH-29
(which contains the reactive open reading frame), although the 3' end of TbH-29 was
found to be identical to the M. tuberculosis cosmid Y227. TbH-32 and TbH-33 were
found to be identical to the previously identified M. tuberculosis insertion element
IS6110 and to the M. tuberculosis cosmid Y50, respectively. No significant homologies
to TbH-30 were found.

Positive phagemid from this additional screening were used to infect
E. coli XL-1 Blue MRF', as described in Sambrook et al., supra. Induction of
recombinant protein was accomplished by the addition of IPTG. Induced and uninduced
lysates were run in duplicate on SDS-PAGE and transferred to nitrocellulose filters.
Filters were reacted with human M. tuberculosis sera (1:200 dilution) reactive with TbH
and a rabbit sera (1:200 or 1:250 dilution) reactive with the N-terminal 4 Kd portion of
lacZ. Sera incubations were performed for 2 hours at room temperature. Bound
antibody was detected by addition of 125I-labeled Protein A and subsequent exposure to
film for variable times ranging from 16 hours to 11 days. The results of the
immunoblots are summarized in Table 4.

TABLE 4

Antigen     Human M. tb Sera    Anti-lacZ Sera
TbH-29      45 Kd               45 Kd
TbH-30      No reactivity       29 Kd
TbH-32      12 Kd               12 Kd
TbH-33      16 Kd               16 Kd



Positive reaction of the recombinant human M. tuberculosis antigens
with both the human M. tuberculosis sera and anti-lacZ sera indicate that reactivity of
the human M. tuberculosis sera is directed towards the fusion protein. Antigens
reactive with the anti-lacZ sera but not with the human M. tuberculosis sera may be the
result of the human M. tuberculosis sera recognizing conformational epitopes, or the
antigen-antibody binding kinetics may be such that the 2 hour sera exposure in the
immunoblot is not sufficient.

The results of T-cell assays performed on Tb38-1, ESAT-6 and other
representative recombinant antigens are presented in Tables 5A, B and 6, respectively,
below:







TABLE 5B

RESULTS OF PBMC INTERFERON-γ PRODUCTION TO REPRESENTATIVE ANTIGENS

Antigen   Donor
          1     2     3     4     5     6     7             8     9     10    11
Tb38-1    +++                     +                         ++
ESAT-6    +++   +           +     ±     +                   +
TbH-9     ++    ++                ±     ±     [illegible]

TABLE 6

Summary of T-cell Responses to Representative Antigens

Antigen     Proliferation                           Interferon-γ                            total
            patient 4    patient 5    patient 6    patient 4    patient 5    patient 6
TbH9                                                                         ++           13
TbM7                                               [illegible]  +                         4
TbH5                     +                                                   ++           8
TbL23                    +            ±            ++                                     7.5
TbH4                                                            [illegible]  +            7
- control                                                                                 0



These results indicate that both the inventive M. tuberculosis antigens
and ESAT-6 can induce proliferation and/or interferon-γ production in T-cells derived
from an M. tuberculosis-immune individual. To the best of the inventors' knowledge,
ESAT-6 has not been previously shown to stimulate human immune responses.

A set of six overlapping peptides covering the amino acid sequence of
the antigen Tb38-1 was constructed using the method described in Example 6. The
sequences of these peptides, hereinafter referred to as pep 1-6, are provided in SEQ ID
Nos. 93-98, respectively. The results of T-cell assays using these peptides are shown in
Tables 7 and 8. These results confirm the existence, and help to localize T-cell epitopes
within Tb38-1 capable of inducing proliferation and interferon-γ production in T-cells
derived from an M. tuberculosis immune individual.
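The idea of covering an antigen with a small set of overlapping peptides, so that a T-cell response can be localized to one region, can be illustrated with a simple tiling routine. This is only a sketch of the general approach: the peptide length, overlap and example sequence are hypothetical, and it does not reproduce the actual pep 1-6 sequences (SEQ ID Nos. 93-98) or the synthesis method of Example 6.

```python
# Illustrative sketch: tile a protein sequence with fixed-length peptides that
# overlap their neighbours. Length, overlap and the sequence are hypothetical.

def overlapping_peptides(sequence, length, overlap):
    """Return peptides of `length` residues, each sharing `overlap` residues
    with the next; a final peptide is anchored at the C-terminus so that the
    whole sequence is covered."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and less than length")
    if len(sequence) <= length:
        return [sequence]
    step = length - overlap
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the C-terminal end is covered
    return [sequence[s:s + length] for s in starts]
```

Because adjacent peptides share residues, an epitope that straddles the boundary between two tiles is still presented intact in at least one peptide, provided the overlap is at least as long as the epitope.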
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Studies were undertaken to determine whether the antigens TbH-9 and Tb38-1 
represent cellular proteins or are secreted into M. tuberculosis culture media. In the first 
study, rabbit sera were raised against A) secretory proteins of M. tuberculosis, B) the known 
secretory recombinant M. tuberculosis antigen 85b, C) recombinant Tb38-1 and D)
recombinant TbH-9, using protocols substantially the same as that described in Example
3A. Total M. tuberculosis lysate, concentrated supernatant of M. tuberculosis cultures and
the recombinant antigens 85b, TbH-9 and Tb38-1 were resolved on denaturing gels,
immobilized on nitrocellulose membranes and duplicate blots were probed using the rabbit
sera described above. 

The results of this analysis using control sera (panel I) and antisera (panel II)
against secretory proteins, recombinant 85b, recombinant Tb38-1 and recombinant TbH-9 are
shown in Figures 3A-D, respectively, wherein the lane designations are as follows: 1)
molecular weight protein standards; 2) 5 µg of M. tuberculosis lysate; 3) 5 µg secretory
proteins; 4) 50 ng recombinant Tb38-1; 5) 50 ng recombinant TbH-9; and 6) 50 ng
recombinant 85b. The recombinant antigens were engineered with six terminal histidine
residues and would therefore be expected to migrate with a mobility approximately 1 kD
larger than the native protein. In Figure 3D, recombinant TbH-9 is lacking approximately 10
kD of the full-length 42 kD antigen, hence the significant difference in the size of the
immunoreactive native TbH-9 antigen in the lysate lane (indicated by an arrow). These
results demonstrate that Tb38-1 and TbH-9 are intracellular antigens and are not actively
secreted by M. tuberculosis. 
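The statement that six terminal histidine residues add roughly 1 kD can be checked with a quick calculation. The sketch below uses the average residue mass of histidine in a peptide chain (about 137.14 Da, i.e. free histidine minus one water); the function names and the example native mass are hypothetical, and real expression tags often carry extra linker residues, which is not modelled here.

```python
# Illustrative arithmetic: mass contributed by a hexahistidine tag.
# Assumption: average residue mass of His within a peptide is ~137.14 Da.

HIS_RESIDUE_MASS_DA = 137.14


def his_tag_mass_da(n_his=6):
    """Approximate mass added by n histidine residues, in daltons."""
    return n_his * HIS_RESIDUE_MASS_DA


def expected_tagged_mass_kd(native_mass_kd, n_his=6):
    """Approximate mass of the tagged protein, in kilodaltons."""
    return native_mass_kd + his_tag_mass_da(n_his) / 1000.0
```

Six residues contribute about 0.82 kD, consistent with the "approximately 1 kD" shift quoted above once gel resolution is taken into account.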

The finding that TbH-9 is an intracellular antigen was confirmed by
determining the reactivity of TbH-9-specific human T cell clones to recombinant TbH-9,
secretory M. tuberculosis proteins and PPD. A TbH-9-specific T cell clone (designated
131TbH-9) was generated from PBMC of a healthy PPD-positive donor. The proliferative
response of 131TbH-9 to secretory proteins, recombinant TbH-9 and a control M.
tuberculosis antigen, TbRa11, was determined by measuring uptake of tritiated thymidine, as
described in Example 1. As shown in Figure 4A, the clone 131TbH-9 responds specifically
to TbH-9, showing that TbH-9 is not a significant component of M. tuberculosis secretory
proteins. Figure 4B shows the production of IFN-γ by a second TbH-9-specific T cell clone
(designated PPD 800-10) prepared from PBMC of a healthy PPD-positive donor,
following stimulation of the T cell clone with secretory proteins, PPD or recombinant TbH-9.
These results further confirm that TbH-9 is not secreted by M. tuberculosis.

C. USE OF SERA FROM PATIENTS HAVING EXTRAPULMONARY TUBERCULOSIS TO IDENTIFY
DNA SEQUENCES ENCODING M. TUBERCULOSIS ANTIGENS

Genomic DNA was isolated from M. tuberculosis Erdman strain, randomly
sheared and used to construct an expression library employing the Lambda ZAP expression
system (Stratagene, La Jolla, CA). The resulting library was screened using pools of sera
obtained from individuals with extrapulmonary tuberculosis, as described above in Example
3B, with the secondary antibody being goat anti-human IgG + A + M (H+L) conjugated with
alkaline phosphatase.

Eighteen clones were purified. Of these, 4 clones (hereinafter referred to as
XP14, XP24, XP31 and XP32) were found to bear some similarity to known sequences. The
determined DNA sequences for XP14, XP24 and XP31 are provided in SEQ ID Nos.: 156-158,
respectively, with the 5' and 3' DNA sequences for XP32 being provided in SEQ ID
Nos.: 159 and 160, respectively. The predicted amino acid sequence for XP14 is provided in
SEQ ID No: 161. The reverse complement of XP14 was found to encode the amino acid
sequence provided in SEQ ID No.: 162.

Comparison of the sequences for the remaining 14 clones (hereinafter referred
to as XP1-XP6, XP17-XP19, XP22, XP25, XP27, XP30 and XP36) with those in the
genebank as described above, revealed no homologies with the exception of the 3' ends of
XP2 and XP6 which were found to bear some homology to known M. tuberculosis cosmids.
The DNA sequences for XP27 and XP36 are shown in SEQ ID Nos.: 163 and 164,
respectively, with the 5' sequences for XP4, XP5, XP17 and XP30 being shown in SEQ ID
Nos.: 165-168, respectively, and the 5' and 3' sequences for XP2, XP3, XP6, XP18, XP19,
XP22 and XP25 being shown in SEQ ID Nos.: 169 and 170; 171 and 172; 173 and 174; 175
and 176; 177 and 178; 179 and 180; and 181 and 182, respectively. XP1 was found to
overlap with the DNA sequences for TbH4, disclosed above. The full-length DNA sequence
for TbH4-XP1 is provided in SEQ ID No.: 183. This DNA sequence was found to contain an
open reading frame encoding the amino acid sequence shown in SEQ ID No.: 184. The
reverse complement of TbH4-XP1 was found to contain an open reading frame encoding the
amino acid sequence shown in SEQ ID No.: 185. The DNA sequence for XP36 was found to
contain two open reading frames encoding the amino acid sequences shown in SEQ ID Nos.:
186 and 187, with the reverse complement containing an open reading frame encoding the
amino acid sequence shown in SEQ ID No.: 188.
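
The paragraph above reports open reading frames on both the determined strand and its reverse complement. The following Python sketch shows one simple way such a six-frame scan could be carried out. It is an illustration only: the text does not say which software was used, the function names are hypothetical, and the sketch assumes ATG as the only start codon (M. tuberculosis also uses alternative starts such as GTG, which are ignored here).

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string (N is left as N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 2) -> List[Tuple[str, int, int]]:
    """Scan all six reading frames for ATG...stop open reading frames.

    Returns (strand, start, end) tuples. Coordinates are 0-based with an
    exclusive end and refer to the strand that was scanned, so "-" hits
    are positions on the reverse complement. The codon count used for
    the min_codons filter includes the stop codon.
    """
    seq = "".join(seq.upper().split())
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the frame
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        hits.append((strand, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return hits


if __name__ == "__main__":
    for hit in find_orfs("ATGAAATAG"):
        print(hit)
```

In practice a minimum length well above two codons would be chosen; the small default is used here only so that toy inputs produce output.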

Recombinant XP1 protein was prepared as described above in Example 3B,
with a metal ion affinity chromatography column being employed for purification. As
illustrated in Figures 8A-B and 9A-B, using the assays described herein, recombinant XP1
was found to stimulate cell proliferation and IFN-γ production in T cells isolated from
M. tuberculosis-immune donors.



D " iVr. m A [u^!it:! serum pn °: fro m pattf ™ ua - ™ 

IDENTIFY UNA SEQUENrFSFNCQDiNC, V rr^r TO flwA ^~ 

Genomic DNA was isolated from M. tuberculosis Erdman strain, randomly
sheared and used to construct an expression library employing the Lambda Screen expression
system (Novagen, Madison, WI), as described below in Example 6. Pooled serum obtained
from M. tuberculosis-infected patients and that was shown to react with M. tuberculosis
lysate but not with the previously expressed proteins 38kD, Tb38-1, TbRa3, TbH4, DPEP and
TbRa11, was used to screen the expression library as described above in Example 3B, with
the secondary antibody being goat anti-human IgG + A + M (H+L) conjugated with alkaline
phosphatase.

Twenty-seven clones were purified. Comparison of the determined cDNA
sequences for these clones revealed no significant homologies to 10 of the clones (hereinafter
referred to as LSER-10, LSER-11, LSER-12, LSER-13, LSER-16, LSER-18, LSER-23,
LSER-24, LSER-25 and LSER-27). The determined 5' cDNA sequences for LSER-10,
LSER-11, LSER-12, LSER-13, LSER-16 and LSER-25 are provided in SEQ ID NO: 242-
247, respectively, with the corresponding predicted amino acid sequences for LSER-10,
LSER-12, LSER-13, LSER-16 and LSER-25 being provided in SEQ ID NO: 248-252,
respectively. The determined full-length cDNA sequences for LSER-18, LSER-23, LSER-24
and LSER-27 are shown in SEQ ID NO: 253-256, respectively, with the corresponding
predicted amino acid sequences being provided in SEQ ID NO: 257-260. The remaining
seventeen clones were found to show similarities to unknown sequences previously identified
in M. tuberculosis. The determined 5' cDNA sequences for sixteen of these clones
(hereinafter referred to as LSER-1, LSER-3, LSER-4, LSER-5, LSER-6, LSER-8, LSER-14,
LSER-15, LSER-17, LSER-19, LSER-20, LSER-22, LSER-26, LSER-28, LSER-29 and
LSER-30) are provided in SEQ ID NO: 261-276, respectively, with the corresponding
predicted amino acid sequences for LSER-1, LSER-3, LSER-5, LSER-6, LSER-8, LSER-14,
LSER-15, LSER-17, LSER-19, LSER-20, LSER-22, LSER-26, LSER-28, LSER-29 and
LSER-30 being provided in SEQ ID NO: 277-291, respectively. The determined full-length
cDNA sequence for the clone LSER-9 is provided in SEQ ID NO: 292. The reverse
complement of LSER-6 (SEQ ID NO: 293) was found to encode the predicted amino acid
sequence of SEQ ID NO: 294.



M. tuberculosis lysate was prepared as described above in Example 7. The
resulting material was fractionated by HPLC and the fractions were tested for
serological activity with a serum pool from M. tuberculosis-infected patients which showed
little or no immunoreactivity with other antigens of the present invention. Rabbit anti-sera
was generated against the most reactive fraction using the method described in Example 3A.
The anti-sera was used to screen an M. tuberculosis Erdman strain genomic DNA expression
library prepared as described above. Bacteriophage plaques expressing immunoreactive
antigens were purified. Phagemid from the plaques was rescued and the nucleotide sequences
of the M. tuberculosis clones determined.

Ten different clones were purified. Of these, one was found to be TbRa35,
described above, and one was found to be the previously identified M. tuberculosis antigen
HSP60. Of the remaining eight clones, seven (hereinafter referred to as RDIF2, RDIF5,
RDIF8, RDIF10, RDIF11 and RDIF12) were found to bear some similarity to previously
identified M. tuberculosis sequences. The determined DNA sequences for RDIF2, RDIF5,
RDIF8, RDIF10 and RDIF11 are provided in SEQ ID Nos.: 189-193, respectively, with the
corresponding predicted amino acid sequences being provided in SEQ ID Nos.: 194-198,
respectively. The 5' and 3' DNA sequences for RDIF12 are provided in SEQ ID Nos.: 199
and 200, respectively. No significant homologies were found to the antigen RDIF7. The
determined DNA and predicted amino acid sequences for RDIF7 are provided in SEQ ID
Nos.: 201 and 202, respectively. One additional clone, referred to as RDIF6, was isolated;
however, this was found to be identical to RDIF5.

Recombinant RDIF6, RDIF8, RDIF10 and RDIF11 were prepared as
described above. As shown in Figures 8A-B and 9A-B, these antigens were found to
stimulate cell proliferation and IFN-γ production in T cells isolated from M. tuberculosis-
immune donors.

EXAMPLE 4

PURIFICATION AND CHARACTERIZATION OF A POLYPEPTIDE FROM TUBERCULIN PURIFIED
PROTEIN DERIVATIVE

An M. tuberculosis polypeptide was isolated from tuberculin purified protein
derivative (PPD) as follows.

PPD was prepared as published with some modification (Seibert, F. et al.,
Tuberculin purified protein derivative. Preparation and analyses of a large quantity for
standard. The American Review of Tuberculosis 44:9-25, 1941).

M. tuberculosis Rv strain was grown for 6 weeks in synthetic medium in roller
bottles at 37°C. Bottles containing the bacterial growth were then heated to 100°C in water
vapor for 3 hours. Cultures were sterile filtered using a 0.22 µ filter and the liquid phase was
concentrated 20 times using a 3 kD cut-off membrane. Proteins were precipitated once with
50% ammonium sulfate solution and eight times with 25% ammonium sulfate solution. The
resulting proteins (PPD) were fractionated by reverse phase liquid chromatography (RP-
HPLC) using a C18 column (7.8 x 300 mm; Waters, Milford, MA) in a Biocad HPLC system
(Perseptive Biosystems, Framingham, MA). Fractions were eluted from the column with a
linear gradient from 0-100% buffer (0.1% TFA in acetonitrile). The flow rate was 10
ml/minute and eluent was monitored at 214 nm and 280 nm.



WO 99/42076 



PCT/US99/03268 



Six fractions were collected, dried, suspended in PBS and tested individually
in M. tuberculosis-infected guinea pigs for induction of a delayed type hypersensitivity (DTH)
reaction. One fraction was found to induce a strong DTH reaction and was subsequently
fractionated further by RP-HPLC on a microbore Vydac C18 column (Cat. No. 218TP5115)
in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions were eluted
with a linear gradient from 5-100% buffer (0.05% TFA in acetonitrile) with a flow rate of 80
µl/minute. Eluent was monitored at 215 nm. Eight fractions were collected and tested for
induction of DTH in M. tuberculosis-infected guinea pigs. One fraction was found to induce
strong DTH of about 16 mm induration. The other fractions did not induce detectable DTH.
The positive fraction was submitted to SDS-PAGE gel electrophoresis and found to contain a
single protein band of approximately 12 kD molecular weight.

This polypeptide, hereinafter referred to as DPPD, was sequenced from the
amino terminal using a Perkin Elmer/Applied Biosystems Division Procise 492 protein
sequencer as described above and found to have the N-terminal sequence shown in SEQ ID
No.: 129. Comparison of this sequence with known sequences in the gene bank as described
above revealed no known homologies. Four cyanogen bromide fragments of DPPD were
isolated and found to have the sequences shown in SEQ ID Nos.: 130-133. A subsequent
search of the M. tuberculosis genome database released by the Institute for Genomic
Research revealed a match of the DPPD partial amino acid sequence with a sequence present
within the M. tuberculosis cosmid MTY21C12. An open reading frame of 336 bp was
identified. The full-length DNA sequence for DPPD is provided in SEQ ID NO: 240, with
the corresponding full-length amino acid sequence being provided in SEQ ID NO: 241.
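
As a rough plausibility check, the 336 bp open reading frame can be compared with the approximately 12 kD band seen on SDS-PAGE. The Python sketch below makes that estimate under two stated assumptions that are not taken from the text: that the 336 bp figure includes the stop codon, and that an average residue mass of about 110 Da is an adequate approximation. The function name is hypothetical.

```python
# Assumption: approximate average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def orf_to_protein_estimate(orf_bp: int, includes_stop: bool = True):
    """Estimate residue count and mass (kDa) of the protein encoded by an ORF.

    orf_bp must be a multiple of three. If includes_stop is True, the
    final codon is treated as a stop codon and is not counted as a residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    residues = codons - 1 if includes_stop else codons
    mass_kda = residues * AVG_RESIDUE_MASS_DA / 1000.0
    return residues, mass_kda


if __name__ == "__main__":
    residues, mass_kda = orf_to_protein_estimate(336)
    print(f"{residues} residues, about {mass_kda:.1f} kDa")
```

Under these assumptions the estimate is of the same order as the 12 kD band reported above. The calculation ignores any signal peptide processing or post-translational modification.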

The ability of the antigen DPPD to stimulate human PBMC to proliferate and
to produce IFN-γ was assayed as described in Example 1. As shown in Table 9, DPPD was
found to stimulate proliferation and elicit production of large quantities of IFN-γ; more than
that elicited by commercial PPD.
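
The comparison between DPPD and commercial PPD can be made explicit by expressing each proliferation value in Table 9 as a ratio over the medium-only control (a stimulation index). The Python sketch below does this with values transcribed from Table 9. Two points are assumptions of this example rather than statements from the text: the third donor is labelled "C" because its label did not survive in the table, and the ">2.70" IFN-γ reading is stored as its lower bound.

```python
# Values transcribed from Table 9: (proliferation in CPM, IFN-gamma OD450).
# Assumptions: donor label "C" is inferred; ">2.70" is stored as 2.70.
TABLE_9 = {
    "A": {"Medium": (1089, 0.17), "PPD": (8394, 1.29), "DPPD": (13451, 2.21)},
    "B": {"Medium": (450, 0.09), "PPD": (3929, 1.26), "DPPD": (6184, 1.49)},
    "C": {"Medium": (541, 0.11), "PPD": (8907, 0.76), "DPPD": (23024, 2.70)},
}


def stimulation_index(donor: str, stimulator: str) -> float:
    """Proliferation (CPM) with the stimulator divided by the medium control."""
    cpm = TABLE_9[donor][stimulator][0]
    background = TABLE_9[donor]["Medium"][0]
    return cpm / background


if __name__ == "__main__":
    for donor in sorted(TABLE_9):
        ppd = stimulation_index(donor, "PPD")
        dppd = stimulation_index(donor, "DPPD")
        print(f"Donor {donor}: PPD SI = {ppd:.1f}, DPPD SI = {dppd:.1f}")
```

For all three donors the DPPD index is higher than the PPD index, which matches the statement in the text.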
TABLE 9

RESULTS OF PROLIFERATION AND INTERFERON-γ ASSAYS TO DPPD

PBMC Donor   Stimulator          Proliferation (CPM)   IFN-γ (OD450)

A            Medium                     1,089               0.17
             PPD (commercial)           8,394               1.29
             DPPD                      13,451               2.21

B            Medium                       450               0.09
             PPD (commercial)           3,929               1.26
             DPPD                       6,184               1.49

C            Medium                       541               0.11
             PPD (commercial)           8,907               0.76
             DPPD                      23,024              >2.70



EXAMPLE 5

USE OF SERA FROM TUBERCULOSIS-INFECTED MONKEYS TO IDENTIFY DNA

ENCODING M. TUBERCULOSIS ANTIGENS

Genomic DNA was isolated from M. tuberculosis Erdman strain, randomly
sheared and used to construct an expression library employing the Lambda ZAP expression
system (Stratagene, La Jolla, CA). Serum samples were obtained from a cynomolgous
monkey 18, 33, 51 and 56 days following infection with M. tuberculosis Erdman strain.
These samples were pooled and used to screen the M. tuberculosis genomic DNA expression
library using the procedure described above in Example 3C.

Twenty clones were purified. The determined 5' DNA sequences for the clones
referred to as MO-1, MO-2, MO-4, MO-8, MO-9, MO-26, MO-28, MO-29, MO-30, MO-34
and MO-35 are provided in SEQ ID NO: 215-225, respectively, with the corresponding
predicted amino acid sequences being provided in SEQ ID NO: 226-236. The full-length
DNA sequence of the clone MO-10 is provided in SEQ ID NO: 237, with the corresponding
predicted amino acid sequence being provided in SEQ ID NO: 238. The 3' DNA sequence
for the clone MO-27 is provided in SEQ ID NO: 239.

Clones MO-1, MO-30 and MO-35 were found to show a high degree of
relatedness and showed some homology to a previously identified unknown M. tuberculosis
sequence and to cosmid MTCI237. MO-2 was found to show some homology to
aspartokinase from M. tuberculosis. Clones MO-3, MO-7 and MO-27 were found to be
identical and to show a high degree of relatedness to MO-5. All four of these clones showed
some homology to M. tuberculosis heat shock protein 70. MO-27 was found to show some
homology to M. tuberculosis cosmid MTCY339. MO-4 and MO-34 were found to show
some homology to cosmid SCY21B4 and M. smegmatis integration host factor, and were
both found to show some homology to a previously identified, unknown M. tuberculosis
sequence. MO-6 was found to show some homology to M. tuberculosis heat shock protein 65.
MO-8, MO-9, MO-10, MO-26 and MO-29 were found to be highly related to each other and
to show some homology to M. tuberculosis dihydrolipoamide succinyltransferase. MO-28,
MO-31 and MO-32 were found to be identical and to show some homology to a previously
identified M. tuberculosis protein. MO-33 was found to show some homology to a
previously identified 14 kDa M. tuberculosis heat shock protein.

Further studies using the above protocol resulted in the isolation of an
additional four clones, hereinafter referred to as MO-12, MO-13, MO-19 and MO-39. The
determined 5' cDNA sequences for these clones are provided in SEQ ID NO: 295-298,
respectively, with the corresponding predicted protein sequences being provided in SEQ ID
NO: 299-302, respectively. Comparison of these sequences with those in the gene bank as
described above revealed no significant homologies to MO-39. MO-12, MO-13 and MO-19
were found to show some homologies to unknown sequences previously isolated from M.
tuberculosis.



EXAMPLE 6

ISOLATION OF DNA SEQUENCES ENCODING M. TUBERCULOSIS ANTIGENS

BY SCREENING OF A NOVEL EXPRESSION LIBRARY
This example illustrates isolation of DNA sequences encoding M. tuberculosis
antigens by screening of a novel expression library with sera from M. tuberculosis-infected
patients that were shown to be unreactive with a panel of the recombinant M. tuberculosis
antigens TbRa11, TbRa3, Tb38-1, TbH4, TbF and 38 kD.

Genomic DNA from M. tuberculosis Erdman strain was randomly sheared to
an average size of 2 kb, and blunt ended with Klenow polymerase, followed by the addition
of EcoRI adaptors. The insert was subsequently ligated into the Screen phage vector
(Novagen, Madison, WI) and packaged in vitro using the PhageMaker extract (Novagen).
The resulting library was screened with sera from several M. tuberculosis donors that had
been shown to be negative on a panel of previously identified M. tuberculosis antigens as
described above in Example 3B.

A total of 22 different clones were isolated. By comparison, screening of the λ
Zap library described above using the same sera did not result in any positive hits. One of the
clones was found to represent TbRa11, described above. The determined 5' cDNA sequences
for 19 of the remaining 21 clones (hereinafter referred to as Erdsn1, Erdsn2, Erdsn4-Erdsn10,
Erdsn12-18, Erdsn21-Erdsn23 and Erdsn25) are provided in SEQ ID NO: 303-322,
respectively, with the determined 3' cDNA sequences for Erdsn1, Erdsn2, Erdsn4, Erdsn5,
Erdsn7-Erdsn10, Erdsn12-18, Erdsn21-Erdsn23 and Erdsn25 being provided in SEQ ID NO:
323-341, respectively. The complete cDNA insert sequence for the clone Erdsn24 is
provided in SEQ ID NO: 342. Comparison of the determined cDNA sequences with those in
the gene bank revealed no significant homologies to the sequences provided in SEQ ID NO:
309, 316, 318-320, 322, 324, 328, 329, 333, 335, 337, 339 and 341. The sequences of SEQ
ID NO: 303-308, 310-315, 317, 321, 323, 325-327, 330-332, 334, 336, 338, 340 and 342
were found to show some homology to unknown sequences previously identified in M.
tuberculosis.



EXAMPLE 7
ISOLATION OF SOLUBLE M. TUBERCULOSIS ANTIGENS
USING MASS SPECTROMETRY

This example illustrates the use of mass spectrometry to identify soluble M.
tuberculosis antigens.

In a first approach, M. tuberculosis culture filtrate was screened by Western
analysis using serum from a tuberculosis-infected individual. The reactive bands were
excised from a silver stained gel and the amino acid sequences determined by mass
spectrometry. The determined amino acid sequence for one of the isolated antigens is
provided in SEQ ID NO: 343. Comparison of this sequence with those in the gene bank
revealed homology to the 85b antigen.

In a second approach, the high molecular weight region of M. tuberculosis
culture supernatant was studied. This area may contain immunodominant antigens which
may be useful in the diagnosis of M. tuberculosis infection. Two known monoclonal
antibodies, IT42 and IT57 (available from the Center for Disease Control, Atlanta, GA), show
reactivity by Western analysis to antigens in this vicinity, although the identity of the antigens
remains unknown. In addition, unknown high-molecular weight proteins have been described
as containing a surrogate marker for M. tuberculosis infection in HIV-positive individuals
(Jnl. Infect. Dis., 176:133-143, 1997). To determine the identity of these antigens, two-
dimensional gel electrophoresis and two-dimensional Western analysis were performed using
the antibodies IT57 and IT42. Five protein spots in the high molecular weight region were
identified, individually excised, enzymatically digested and subjected to mass spectrometric
analysis.

The determined amino acid sequences for three of these spots (referred to as
spots 1, 2 and 4) are provided in SEQ ID NO: 344, 345-346 and 347, respectively.
Comparison of these sequences with those in the gene bank revealed that spot 1 is the
previously identified PcK-1, a phosphoenolpyruvate kinase. The two sequences isolated from
spot 2 were determined to be from two DNAks, previously identified in M. tuberculosis as
heat shock proteins. Spot 4 was determined to be the previously identified M. tuberculosis
protein Kat G. To the best of the inventors' knowledge, neither PcK-1 nor the two DNAks
have previously been shown to have utility in the diagnosis of M. tuberculosis infection.



EXAMPLE 8
USE OF REPRESENTATIVE ANTIGENS FOR DIAGNOSIS OF TUBERCULOSIS

This example illustrates the effectiveness of several representative 
polypeptides in skin tests for the diagnosis of M. tuberculosis infection. 

Individuals were injected intradermally with 100 µl of either PBS or PBS plus
Tween 20™ containing either 0.1 µg of protein (for TbH-9 and TbRa35) or 1.0 µg of protein
(for TbRa38-1). Induration was measured between 5-7 days after injection, with a response
of 5 mm or greater being considered positive. Of the 20 individuals tested, 2 were PPD
negative and 18 were PPD positive. Of the PPD positive individuals, 3 had active
tuberculosis, 3 had been previously infected with tuberculosis and 9 were healthy. In a
second study, 13 PPD positive individuals were tested with 0.1 µg TbRa11 in either PBS or
PBS plus Tween 20™ as described above. The results of both studies are shown in Table 10.

TABLE 10 

RESULTS OF DTH TESTING WITH REPRESENTATIVE ANTIGENS 




EXAMPLE 9
SYNTHESIS OF SYNTHETIC POLYPEPTIDES



Polypeptides may be synthesized on a Millipore 9050 peptide synthesizer
using FMOC chemistry with HPTU (O-Benzotriazole-N,N,N',N'-tetramethyluronium
hexafluorophosphate) activation. A Gly-Cys-Gly sequence may be attached to the amino
terminus of the peptide to provide a method of conjugation or labeling of the peptide.
Cleavage of the peptides from the solid support may be carried out using the following
cleavage mixture: trifluoroacetic acid:ethanedithiol:thioanisole:water:phenol (40:1:2:2:3).
After cleaving for 2 hours, the peptides may be precipitated in cold methyl-t-butyl-ether. The
peptide pellets may then be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and
lyophilized prior to purification by C18 reverse phase HPLC. A gradient of 0-60%
acetonitrile (containing 0.1% TFA) in water (containing 0.1% TFA) may be used to elute the
peptides. Following lyophilization of the pure fractions, the peptides may be characterized
using electrospray mass spectrometry and by amino acid analysis.



EXAMPLE 10

PREPARATION AND CHARACTERIZATION OF M. TUBERCULOSIS FUSION PROTEINS

A fusion protein containing TbRa3, the 38 kD antigen and Tb38-1 was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 were modified by PCR
in order to facilitate their fusion and the subsequent expression of the fusion protein TbRa3-
38 kD-Tb38-1. TbRa3, 38 kD and Tb38-1 DNA was used to perform PCR using the primers
PDM-64 and PDM-65 (SEQ ID NO: 146 and 147), PDM-57 and PDM-58 (SEQ ID NO: 148
and 149), and PDM-69 and PDM-60 (SEQ ID NO: 150 and 151), respectively. In each case,
the DNA amplification was performed using 10 µl 10X Pfu buffer, 2 µl 10 mM dNTPs, 2 µl
each of the PCR primers at 10 µM concentration, 81.5 µl water, 1.5 µl Pfu DNA polymerase
(Stratagene, La Jolla, CA) and 1 µl DNA at either 70 ng/µl (for TbRa3) or 50 ng/µl (for 38
kD and Tb38-1). For TbRa3, denaturation at 94°C was performed for 2 min, followed by 40
cycles of 96°C for 15 sec and 72°C for 1 min, and lastly by 72°C for 4 min. For 38 kD,
denaturation at 96°C was performed for 2 min, followed by 40 cycles of 96°C for 30 sec,
68°C for 15 sec and 72°C for 3 min, and finally by 72°C for 4 min. For Tb38-1, denaturation
at 94°C for 2 min was followed by 10 cycles of 96°C for 15 sec, 68°C for 15 sec and 72°C for
1.5 min, 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for 1.5 min, and finally by 72°C
for 4 min.
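
The reaction recipe and cycling programme above lend themselves to a quick arithmetic check. The Python sketch below sums the component volumes, derives final concentrations, and totals the programmed hold time for the Tb38-1 amplification. It is an illustration under stated assumptions: the primer volume is read as 2 µl each, the step times are those given for Tb38-1 with a 1.5 min extension in both cycling stages, and ramping time between temperatures is ignored. All names in the code are hypothetical.

```python
# Component volumes in microlitres, as read from the recipe above.
# Assumption: each of the two primers is added at 2 ul.
REACTION_UL = {
    "10X Pfu buffer": 10.0,
    "10 mM dNTPs": 2.0,
    "forward primer (10 uM)": 2.0,
    "reverse primer (10 uM)": 2.0,
    "water": 81.5,
    "Pfu DNA polymerase": 1.5,
    "template DNA": 1.0,
}


def total_volume_ul(components: dict) -> float:
    """Sum of all component volumes."""
    return sum(components.values())


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after dilution, in the same unit as the stock."""
    return stock * volume_ul / total_ul


def program_seconds(initial_s: float, stages: list, final_s: float) -> float:
    """Total hold time of a cycling programme (ramping not included).

    stages is a list of (cycle_count, [step durations in seconds]).
    """
    cycling = sum(cycles * sum(steps) for cycles, steps in stages)
    return initial_s + cycling + final_s


if __name__ == "__main__":
    total = total_volume_ul(REACTION_UL)
    print(f"Total volume: {total} ul")
    print(f"dNTPs: {final_concentration(10.0, 2.0, total)} mM")
    print(f"Each primer: {final_concentration(10.0, 2.0, total)} uM")
    # Tb38-1 programme: 2 min, 10 cycles, 30 cycles, 4 min final extension.
    tb38_1 = program_seconds(120, [(10, [15, 15, 90]), (30, [15, 15, 90])], 240)
    print(f"Tb38-1 hold time: {tb38_1 / 60:.0f} min")
```

With these readings the components add up to a 100 µl reaction, which is consistent with 10 µl of 10X buffer giving a 1X final concentration.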

The TbRa3 PCR fragment was digested with NdeI and EcoRI and cloned
directly into pT7 A L2 IL 1 vector using NdeI and EcoRI sites. The 38 kD PCR fragment was
digested with Sse8387I, treated with T4 DNA polymerase to make blunt ends and then
digested with EcoRI for direct cloning into the pT7 A L2Ra3-1 vector which was digested with
StuI and EcoRI. The 38-1 PCR fragment was digested with Eco47III and EcoRI and directly
subcloned into pT7 A L2Ra3/38kD-17 digested with the same enzymes. The whole fusion was
then transferred to pET28b using NdeI and EcoRI sites. The fusion construct was
confirmed by DNA sequencing.

The expression construct was transformed into BLR pLys S E. coli (Novagen,
Madison, WI) and grown overnight in LB broth with kanamycin (30 µg/ml) and
chloramphenicol (34 µg/ml). This culture (12 ml) was used to inoculate 500 ml 2XYT with
the same antibiotics and the culture was induced with IPTG at an OD560 of 0.44 to a final
concentration of 1.2 mM. Four hours post-induction, the bacteria were harvested and
sonicated in 20 mM Tris (8.0), 100 mM NaCl, 0.1% DOC, 20 µg/ml Leupeptin, 20 mM
PMSF followed by centrifugation at 26,000 X g. The resulting pellet was resuspended in 8 M
urea, 20 mM Tris (8.0), 100 mM NaCl and bound to Pro-bond nickel resin (Invitrogen,
Carlsbad, CA). The column was washed several times with the above buffer then eluted with
an imidazole gradient (50 mM, 100 mM, 500 mM imidazole was added to 8 M urea, 20 mM
Tris (8.0), 100 mM NaCl). The eluates containing the protein of interest were then dialyzed
against 10 mM Tris (8.0).

The DNA and amino acid sequences for the resulting fusion protein
(hereinafter referred to as TbRa3-38 kD-Tb38-1) are provided in SEQ ID NO: 152 and 153,
respectively.

A fusion protein containing the two antigens TbH-9 and Tb38-1 (hereinafter
referred to as TbH9-Tb38-1) without a hinge sequence, was prepared using a similar
procedure to that described above. The DNA sequence for the TbH9-Tb38-1 fusion protein is
provided in SEQ ID NO: 156.



The ability of the fusion protein TbH9-Tb38-1 to induce T cell proliferation
and IFN-γ production in PBMC preparations was examined using the protocol described
above in Example 1. PBMC from three donors were employed: one who had been previously
shown to respond to TbH9 but not Tb38-1 (donor 131); one who had been shown to respond
to Tb38-1 but not TbH9 (donor 184); and one who had been shown to respond to both
antigens (donor 201). The results of these studies (Figs. 5-7, respectively) demonstrate the
functional activity of both the antigens in the fusion protein.

A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and DPEP was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 were modified by PCR
and cloned into vectors essentially as described above, with the primers PDM-69 (SEQ ID
NO: 150) and PDM-83 (SEQ ID NO: 205) being used for amplification of the Tb38-1A
fragment. Tb38-1A differs from Tb38-1 by a DraI site at the 3' end of the coding region that
keeps the final amino acid intact while creating a blunt restriction site that is in frame. The
TbRa3/38kD/Tb38-1A fusion was then transferred to pET28b using NdeI and EcoRI sites.

DPEP DNA was used to perform PCR using the primers PDM-84 and PDM-
85 (SEQ ID NO: 206 and 207, respectively) and 1 µl DNA at 50 ng/µl. Denaturation at 94 °C
was performed for 2 min, followed by 10 cycles of 96 °C for 15 sec, 68 °C for 15 sec and 72
°C for 1.5 min; 30 cycles of 96 °C for 15 sec, 64 °C for 15 sec and 72 °C for 1.5 min; and
finally by 72 °C for 4 min. The DPEP PCR fragment was digested with EcoRI and Eco72I
and cloned directly into the pET28Ra3/38kD/38-1A construct which was digested with DraI
and EcoRI. The fusion construct was confirmed to be correct by DNA sequencing.
Recombinant protein was prepared as described above. The DNA and amino acid sequences
for the resulting fusion protein (hereinafter referred to as TbF-2) are provided in SEQ ID NO:
208 and 209, respectively.

The reactivity of the fusion protein TbF-2 with sera from M. tuberculosis-
infected patients was examined by ELISA using the protocol described above. The results of
these studies (Table 11) demonstrate that all four antigens function independently in the
fusion protein.
TABLE 11

REACTIVITY OF TbF-2 FUSION RECOMBINANT WITH TB AND NORMAL SERA

ELISA Reactivity
A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and TbH4 was
prepared as follows.

Genomic M. tuberculosis DNA was used to PCR full-length TbH4 (FL TbH4)
with the primers PDM-157 and PDM-160 (SEQ ID NO: 348 and 349, respectively) and 1 µl
DNA at 100 ng/µl. Denaturation at 96 °C was performed for 2 min, followed by 40 cycles of
96 °C for 30 sec, 61 °C for 20 sec and 72 °C for 5 min; and finally by annealing at 72 °C for
10 min. The FL TbH4 PCR fragment was digested with EcoRI and Sca I (New England
Biolabs) and cloned directly into the pET28Ra3/38kD/38-1A construct described above
which was digested with DraI and EcoRI. The fusion construct was confirmed to be correct
by DNA sequencing. Recombinant protein was prepared as described above. The DNA and
amino acid sequences for the resulting fusion protein (hereinafter referred to as TbF-6) are
provided in SEQ ID NO: 350 and 351, respectively.

A fusion protein containing the antigen 38kD and DPEP separated by a linker 
was prepared as follows. 

38 kD DNA was used to perform PCR using the primers PDM-176 and PDM-
175 (SEQ ID NO: 352 and 353, respectively), and 1 µl pET28Ra3/38kD/38-1/Ra2A-12 DNA
at 110 ng/µl. Denaturation at 96 °C was performed for 2 min, followed by 40 cycles of 96 °C
for 30 sec, 71 °C for 15 sec and 72 °C for 5 min and 40 sec; and finally by annealing at 72 °C
for 4 min. The two sets of primers PDM-171, PDM-172, and PDM-173, PDM-174 were
annealed by heating to 95 °C for 2 min and then ramping down to 25 °C slowly at 0.1 °C/sec.
DPEP DNA was used to perform PCR as described above. The 38 kD fragment was digested
with Eco RI (New England Biolabs) and cloned into a modified pT7AL2 vector which was
cut with Eco 72 I (Promega) and Eco RI. The modified pT7AL2 construct was designed to
have a MGHHHHHH amino acid coding region in frame just 5' of the Eco 72 I site. The
construct was digested with Kpn 2I (Gibco, BRL) and Pst I (New England Biolabs) and the
annealed sets of phosphorylated primers (PDM-171, PDM-172 and PDM-173, PDM-174)
were cloned in. The DPEP PCR fragment was digested with Eco RI and Eco 72 I and cloned
into this second construct which was digested with Eco 47 III (New England Biolabs) and
Eco RI. Ligations were done with a ligation kit from Panvera (Madison, WI). The resulting
construct was digested with NdeI (New England Biolabs) and Eco RI, and transferred to a
modified pET28 vector. The fusion construct was confirmed to be correct by DNA
sequencing.
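
The primer annealing step above (2 min at 95 °C, then a slow ramp to 25 °C at 0.1 °C/sec) implies a defined duration and temperature profile. The Python sketch below works these out. It is only an illustration of the arithmetic; the function names are hypothetical and the model assumes a perfectly linear ramp with no instrument lag.

```python
def ramp_seconds(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Time needed to move between two temperatures at a constant rate."""
    if rate_c_per_s <= 0:
        raise ValueError("ramp rate must be positive")
    return abs(start_c - end_c) / rate_c_per_s


def temperature_at(t_s: float, hold_s: float = 120.0, start_c: float = 95.0,
                   end_c: float = 25.0, rate_c_per_s: float = 0.1) -> float:
    """Block temperature t_s seconds into the annealing programme.

    The programme holds at start_c for hold_s seconds, then cools
    linearly at rate_c_per_s until end_c is reached.
    """
    if t_s <= hold_s:
        return start_c
    return max(end_c, start_c - rate_c_per_s * (t_s - hold_s))


if __name__ == "__main__":
    ramp = ramp_seconds(95.0, 25.0, 0.1)
    print(f"Ramp: {ramp:.0f} s; whole programme: {(120 + ramp) / 60:.1f} min")
    print(f"Temperature after 470 s: {temperature_at(470):.1f} C")
```

The slow ramp is what allows the complementary oligonucleotides to anneal gradually rather than being snap-cooled.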

Recombinant protein was prepared essentially as described above. The DNA 
and amino acid sequences for the resulting fusion protein (hereinafter referred to as TbF-8) 
are provided in SEQ ID NO: 354 and 355, respectively. 

One of skill in the art will appreciate that the order of the individual antigens 
within the fusion protein may be changed and that comparable activity would be expected 
provided each of the epitopes is still functionally available. In addition, truncated forms of 
the proteins containing active epitopes may be used in the construction of fusion proteins. 

From the foregoing, it will be appreciated that, although specific embodiments
of the invention have been described herein for the purpose of illustration, various
modifications may be made without deviating from the spirit and scope of the invention.
SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANTS: Reed, Steven G.

Skeiky, Yasir A.W.
Dillon, Davin C.
Campos-Neto, Antonio
Houghton, Raymond
Vedvick, Thomas S.
Twardzik, Daniel R.
Lodes, Michael J.
Hendrickson, Ronald

(ii) TITLE OF INVENTION: COMPOUNDS AND METHODS FOR IMMUNOTHERAPY 
AND DIAGNOSIS OF TUBERCULOSIS 

(iii) NUMBER OF SEQUENCES: 355 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SEED and BERRY LLP 

(B) STREET: 6300 Columbia Center, 701 Fifth Avenue 

(C) CITY: Seattle 

(D) STATE: Washington 

(E) COUNTRY: USA 

(F) ZIP: 98104-7092 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 05-MAY-1998 

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Maki, David J.

(B) REGISTRATION NUMBER: 31,392

(C) REFERENCE/DOCKET NUMBER: 210121.411C9

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (206) 622-4900 

(B) TELEFAX: (206) 682-6031 



(2) INFORMATION FOR SEQ ID NO: If 

(i) SEQUENCE CHARACTERISTICS; 

(A) LENGTH: 766 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
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(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 

CGAGGCACCG GTAGTTTGAA CCAAACGCAC AATCGACGGG CAAACGAACG GAAGAACACA 60 
ACCATGAAGA TGGTGAAATC GATCGCCGCA GGTCTGACCG CCGCGGCTGC AATCGGCGCC 120 
GCTGCGGCCG GTGTGACTTC GATCATGGCT GGCGGCCCGG TCGTATACCA GATGCAGCCG 
GTCGTCTTCG GCGCGCCACT GCCGTTGGAC CCGGCATCCG CCCCTGACGT CCCGACCGCC 
GCCCAGTTGA CCAGCCTGCT CAACAGCCTC GCCGATCCCA ACGTGTCGTT TGCGAACAAG 
GGCAGTCTGG TCGAGGGCGG CATCGGGGGC ACCGAGGCGC GCATCGCCGA CCACAAGCTG 
AAGAAGGCCG CCGAGCACGG GGATCTGCCG CTGTCGTTCA GCGTGACGAA CATCCAGCCG 
GCGGCCGCCG GTTCGGCCAC CGCCGACGTT TCCGTCTCGG GTCCGAAGCT CTCGTCGCCG 
GTCACGCAGA ACGTCACGTT CGTGAATCAA GGCGGCTGGA TGCTGTCACG CGCATCGGCG 
ATGGAGTTGC TGCAGGCCGC AGGGNAACTG ATTGGCGGGC CGGNTTCAGC CCGCTGTTCA 
GCTACGCCGC CCGCCTGGTG ACGCGTCCAT GTCGAACACT CGCGCGTGTA GCACGGTGCG 
GTNTGCGCAG GGNCGCACGC ACCGCCCGGT GCAAGCCGTC CTCGAGATAG GTGGTGNCTC 
GNCACCAGNG ANCACCCCCT NNTCGNCNNT TCTCGNTGNT GNATGA 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 752 base pairs 

(B) TYPE: nucleic acid 

(C) STRAND EDNESS : single 

(D) TOPOLOGY: linear 



(xi) 



SEQUENCE DESCRIPTION: SEQ ID NO:2: 



{ 2 ) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 813 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 

CATATGCATC ACCATCACCA TCACACTTCT AACCGCCCAG CGCGTCGGGG GCGTCGAGCA 
CCACGCGACA CCGGGCCCGA TCGATCTGCT AGCTTGAGTC TGGTCAGGCA TCGTCGTCAG 
CAGCGCGATG CCCTATGTTT GTCGTCGACT CAGATATCGC GGCAATCCAA TCTCCCGCC" 



180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
766 



60 
12 0 
180 
240 



ATGCATCACC ATCACCATCA CGATGAAGTC ACGGTAGAGA CGACCTCCGT CTTCCGCGCA 
GACTTCCTCA GCGAGCTGGA CGCTCCTGCG CAAGCGGGTA CGGAGAGCGC GGTCTCCGGG 
GTGGAAGGGC TCCCGCCGGG CTCGGCGTTG CTGGTAGTCA AACGAGGCCC CAACGCCGGG 
TCCCGGTTCC TACTCGACCA AGCCATCACG TCGGCTGGTC GGCATCCCGA CAGCGACATA 
TTTCTCGACG ACGTGACCGT GAGCCGTCGC CATGCTGAAT TCCGGTTGGA AAACAACGAA 3 00 

TTCAATGTCG TCGATGTCGG GAGTCTCAAC GGCACCTACG TCAACCGCGA GCCCGTGGAT 360 
TCGGCGGTGC TGGCGAACGG CGACGAGGTC CAGATCGGCA AGCTCCGGTT GGTGTTCTTG 
ACCGGACCCA AGCAAGGCGA GGATGACGGG AGTACCGGGG GCCCGTGAGC GCACCCGATA 
GCCCCGCGCT GGCCGGGATG TCGATCGGGG CGGTCCTCCG ACCTGCTACG ACCGGATTTT 
CCCTGATGTC CACCATCTCC AAGATTCGAT TCTTGGGAGG CTTGAGGGTC NGGGTGACCC 
CCCCGCGGGC CTCATTCNGG GGTNTCGGCN GGTTTCACCC CNTACCNACT GCCNCCCGGN 
TTG CNAATTC NTTCTTCNCT GCCCNNAAAG GGACCNTTAN CTTGCCGCTN GAAANGGTNA 
TCCNGGGCCC NTCCTNGAAN CCCCNTCCCC CT 






SSSS rr^Sf CTACTCCCGG AGGAATTTCG ACGTGCGCAT CAAGATCTTC 
SSSS SST" TTTGCTCTGT TGTTCGGGTG TGGCCACGGC CGCGCCCAAG 
ACCTACTGCG AGGAGTTGAA AGGCACCGAT ACCGGCCAGG CGTGCCAGAT TCAAATGTCC 
GACCCGGCCT ACAACATCAA CATCAGCCTG CCCAGTTACT ACCCCGAcS GAAGTCGC^G 
GAAAATTACA TCGCCCAGAC GCGCGACAAG TTCCTCAGCG CGGCCACATc" G^ScTCcI 

S^aIgc TCGGCCACAT ^rc£ S^SccI 

ISSSJSf GCTCAMGGTC taccacaacg ccggcggcac gcacccaacg 

CTctSaS ^r7rrr> ^TCGCA AGCCAATCAC CTATGACACG 

CTGTGGCAGG ctgacaccga TCCGCTGCCA CTCGTCTTCC CCATTGTTGC AAGGTGAACT 
£S5^ ««««« ACWGGTATCG ATAGCCGCCN AATGCCGGCT SctcS 
TGAAATTATC ACAACTTCGC AGTCACNAAA NAA 

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 447 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

SSSSSI CCGATAACTT CCAGCTGTCC CAGGGTGGGC AGGGATTCGC 
^Il^r GG-OGGCGa TGGCGATCGC GGGCCAGATC CGATCGGGTG GGGGGTCAC- 

?;™ C7A CCGCC7TCCT CGGCTTGGGT GTTGTCGACA aSSgSaa 
^ GTCCAACGCG TGGTCGGGAG CGCTCCGGCG GCAAGTCTCG GCATCTCCAC 

SSSSS f CACCGC3G TCSACGGCGC TCCGATCAAC TCGGCcScG 

c-gSSt 2SS TC ccggtgacgt catctcggtg aactggcaaa ccaagtSS 

A^SS~C~ TGACATTGGC CGAGGGACCC CCGGCCTGAT TTCGTCGYGG 420 

a.ALwAC.Cj CCGGCCGGCC aattgga 

447 

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 604 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GTCCCACTGC GGTCGCCGAG TATGTCGCCC AGCAAATGTC TGGCAGCCGC CCAACGGAAT 60

S'c^ga SSSSSS GTTGTCGAAC CCGCCGCCGC ggaagtatcg SSSS Jo 

cI-Scgg S££?SS cggaatggcg cgagtgagga GGCGGGCAAT TTGGCGGGGC 180 

NGAG CGCCGG AATGGCGCGA GTGAGGAGGT GGNCAGTCAT GCCCAGNGTG 240 
SSSS mCCmm CCATTTGACA ATCGAGGTAG tSgcgS 

n™^ AAAACGGGNG GNGACGTCC3 NTGTTCTGGT GGTGNTAGGT GNCTGNCTGG 360 

NGTNGNGGNT ATCAGGATGT TCTTCGNCGA AANCTGATGN CGAGGAACAG GGTGTNCCCG 420

NNANNCCNAN GGNGTCCNAN CCCNNNNTCC TCGNCGANAT CANANAGNCG StSgNGA 48 

NAAAAGGGTG GANCAGNNNN AANTNGNGGN CCNAANAANC NNNANNGNNG NNAGNTNGNT 540

^NTNTTNNC ANNNNNNNTG NNGNNGNNCN NNNCAANCNN NTNNNNGNAA NNGGNTTNTT 600

604 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 633 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AACCACCTCA CTAAAGGGAA CAAAAGCTNG AGCTCCACCG CGGTGGCGGC 
CTAGTGKA ™ YYYCKGGCTG CAGSAATYCG GYACGAGCAT tSSStI 
TAACGGTCCT GTTACGGTGA TCGAATGACC GACGACATCC TGCTGATCGA CACCGaSS 
J*"**" CAACCGGCCG CAGTCCCGYA ACGC^cS £££££ 

cgggatcggt ttttcgcggy gttggycgac gccgaggycs acgacgacat Sacgtcotc 

ATCCTCACCG GYGCCGATCC GGTGTTCTGC GCCGGACTGG ACCTaSSr 
GCAGACCGCG CTGCCGGACA TCTCACCGCG GTGGGCGGCC ATGaSaSc C^SStS 

GAT C ctc^C ££££ S?"™ G "" ^SSSS 

G -SS~C SSJSSf GCTrCGNCGA CACCCACGCC CGGGTGGGGC TGCTGCCCAC 

CCtSccIgC ££££ ££££ S™ ~° ™ 

633 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1362 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:






AACGGCGATC GACGCGGCCC TGGCCAGAGT 60
AAATTTGTCA ACCATATTGA GCCCGTCGCG 120
GTCTATGCCG AGGCCCGCCG CGAGTTCGGC 180
CCGGACGAGG GACTGCTCAC CGCCGGCTGG 240
CAGGTGCCGC GTGGCCGCAA GGAAGCCGTC 300
CCCTGGTGCG TCGACGCACA CACCACCATG 360
GCGGCGATCT TGGCCGGCAC AGCACCTGCC 420
TGGGCGGCAG GAACCGGGAC ACCGGCGGGA 480
GCCGAATACC TGGGCACCGC GGTGCAATTC 540
CTGGACGAAA CCTTCCTGCC GGGGGGCCCG 600
GGACTGGTGT TCGCCCGCAA GGTGCGCGCG 660
CTCGAGCCGC GAACGCTGCC CGACGATCTG 720
ACCGCGTTCG CCGCGCTCAG CCACCACCTG 780
CGTCAGGTGG TCAGGCGGGT CGTGGGGTCG 840
CGCTGGACGA ACGAGCACAC CGCCGAGCTG 900
GCCCTGCTGA CCGGCCTGGC CCCGCATCAG 960
TCCCTGCTCG ACACCGATGC GGCGCTGGTT 1020
GCGCGGCGCA TCGGCACCTG GATCGGCGCC 1080
CCGACTGGGT GAGTGTGCGC GCCCTGTCGG 1140
GCGGCGGCGA ACGGAGGTGG CGACACAGGT 1200
CGCCGTCGTG GGCGTTCGGT TGGCCGCACT 1260
GAAGGTCCAG CTCAACGTGC CGTCACCGAA 1320
GCGCCCAAGG AA 1362



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1458 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

™ c SSSE ST" 0 ™ CGCCGACGCA CTC — 

TGGATGACGT GGCcSS^ JSSJS f^*** CCGGCTGGGG GAAGCCGGTC 
CTAAGGCCTT ScSSS ISI^IZ MCBaCMCa ^CGCGCCGAG CTGCGGACGG 
TGCGCGAGCG cSSgSg £SSS£ TAAAGCTGAG CTTGGCGGCC GTGACGGTAC 
TGATGGACCG SJSgS SSSSE SSSS GGCCGAGTCG ACCGGCGAGC 
CGAGGCGGTG GGCCGAGCGG TTCGcSS CCAGTATGAG CCGGGCTCGT 

CGCCCACGTT GATGAACTCT SSSf- TATTACGCAA CCTGGAATTC CTGCCGAATT 

see =s iii =£= ss — 

sees sssse i ii — sss sk 
ess ssS EF ~ s ™< -~ 

tacISSgS SSSSS cgSc^S ^ CCTGCS ^cgtcgaa cgcaacggcc 

TGTTCGACGC CATCTG*cS Ic^crfJS AGATCGTC3C GCGGATGCCC GCCGCCGAGC 
ACACGATCAA SSS^ CCgSgSS TCCCGGGCTG OTGTTTCTCG 

GCGGGGAGGT CCCACTGCTG Cr^IS™ G CATCGAGGCG ACCAACCCGT 

GGATGCTCGC CG^SS <^TAATCT CGGCTCGATC AACCTCGCCC 

3S5S ~ ™ ~ SS5E S= 

see ssS ~ ™ ™ sss 

sss ssr -™ — — ™« ssss sn 



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 862 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

S™ S£2S GTCGCCCGCT ACCTACCGAG ATCTA - GGc 

TCATC-CCT- rrrrrfSS GGCAiCGCGG TCGTCGTAGT CGGGATCGCG GTGGCCATCS 

SSS cS^c SccgS CCAAACCGGT cagcgccgac aagccgJS 

AAGGTAACGC cScGcSS ScS^ 0 ^^ CCAGCCGG <* GGGCAAACCG 
CCGCGGTGCA SgCcSg GtSSS GCCAAAACCC ^AGACACGC ACGCCCACCG 
CCGTCAAAGG SgSSS SSSJS JiS""™ TCGACGCTGG 

sss? ~ ™r -™ ™~ "™ 

c=™ S ™ S ££££ =s 



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SSSS SSSS SS"™ " e " ooue SGaraMcl «« 

ATCAGCCGCC GCCkSto SSSJS rS^" "CAT^GA 7J0 

862 

(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 622 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

SS££ SSS SSSS G T CTGGGTG «««™ 

GTGCCGAAGG GCctSJS? cSSJSi ^CCSSM GTGCTGCCGC GAACGCTGGA 
TTGGTTGCCG cSSSSS ^ CAAGGTC GACGACCGCC CGATCAACAG CGCGGACGCG 
SScISS SSSSE ff CGCCACGG TGGCGCTAAC CTTTCAGGAT 

TCGCCGCGCA GT^SSSg JSSSS M0Ce6l «« GTGATGAAGG 

TGGTGGTTGG CCGGGCA^ ™ ^ CGTGCGGAGT 

ACCACAGCGT rr-r^ GTCGTCGTCG TTGACGATCG CACGGCGCAC GGCGATGAAG 

?ggSS Scg^cSc gaS™" tcaccgaggc cgggtttgtt gtcgacESS 
gcsgggtSI £££££ ESSS GCTGAACACA gcggtgatcg 

cggaagccac ccsngacatt ct ggaccggngt gacgnctcgc gatgtcaccc 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1200 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

accaacaI? c£c1-5£ GGCCGCCGGC ^cactggtct TGACAGCATG CGGCGGTGGC 
SStS ac^SSS SSS"** acgtctgggt cggtgcactg CGGCGGCAAG 

«c££S gSSgcS SS^ GCA CCATGGAGCA GTTCGTCTAT 

GGGGTGaS SSSSa 3 ACGCCAACGG GTCCGGTGCC 

CCGTCGAcS SSSS£ C^TclScr GCT ^ATGT CCCGTTGAAT 

CCGACGGTGT tcgtt™^ = cggt cggcg GAGCGGTGCG GTTCCCCGGC ATGGGACCTG 

ct^SaT ccIIScSc caagaSS TACAATATCA aoggcgtgag cacgctgaat 

CAGATCCAAG cS-SSS f^™ 0 AACGGCACCA TCACCGTGTG GAATGATCCA 

^SgcSS ag^S C cggcaccgac ctgccgccaa CACCGATTAG CGTTATCTTC 

SgSSS gXS^Sc SSSSS ^ ccagaaat acctcgacgg tctatccaag 

GGGAACAACG GAaIStISc £££££ SSS?** GCGTCGGCGT CGGCGCCAGC 
TGGTCGTTTG cacrrtm^ ^TACTGCAG ACGACCGACG GGTCGATCAC CTACAACGAG 

ga?cSSS cgScaISJ SSS^ ATGGCCCAGA ^CACGTC GGCGGGTCCG 
ggacaaggS aSIcc^? tIgtS* TCGCCGGGGC —^tcatg 

TCTT\CCCGA Tr-c^r rarvT TCGTCGTTCT ACAGACCCAC CCAGCCTGGC 

acSgSSg StaacgIc g^ocaa SSSS S^ TACM GGATGCGACG 

iaiGlAA GCCGCGATTG GTCCAGGCCA AGAAGGCCTG 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 12: 

CGTTTCTGCA ACA^G^C G^ScSr GTTCGACAAG ACCGGGCATC 

GGTTGCTCCA AGCGGtSc- -C^S- ACGTCACCCG GCTCAAGGTC ACCGTCGACG 
CCGCGACCGG CCg££S£ TCG^S f^* 3 * CXa "°^ CAGACGATCG 
ACGCGGGCCG GCGGATCCGG GaSc~™ CCA ^CTGCA GAACATCCCG ATCCGCACCG 
CGGCCGACTA CaSIgATC 2££SS ******* CGGTTACGCC GAGTTGATGA 
TCATCGAGGC GTTcSScC SSSESJ TCATGGGGCA ^TGTCCGGG GACGAGGGCC 
GTGTGCCCAT CGaSSc SSS££ SSS?*™ CGTCGCGTCC CGGGTGTTCG 
GGCTGGTTTA CGGGTTGAGC JSJSSI 3^ GGTCAAGGCG ATGTCCTACG 

AAGCCAACGA GCAGATGGAC GCG^CG ° rraA *»*«= TCCACCGAGG 

GC3CCGTAGT CGAGCGGGC CGoSSJ S^"^ CGGGGTGCGC GACTACCTGC 
GCTACCTGCC CGAG~5£c SSSaS GCTACACCTC GACGGTGCTG GGCCGTCGCC 
C3CTGAACGC GCC^C^G GTCAAGTGCG GGAGGCCGCC GAGCGGGCGG 

TCGACAAGGC GCTCAACGAG CCGACATCAT CAAGGTGGCC ATGATCCAGG 

AGCTGCTGTT SaA^GCC SSSS GCTGCTCCAG GTCCACGACG 

AGATGGGCGG CGCTTaIc~ £SSS£ CGAGGCCCTG GTGCGCGACA 

GCTGGGACGC GGCGGCGCAC SSJSS S^ 80 *** GTC3GTGG <* TACGGCCGCA 
TTTCCGCCCT GAg£caSc GCGT ^TCT GGGGCGGGAA TTCGGCGATT 

CGAGTAGCC- CGTCA * CGGGACCGAG TTTGTCCAGC GTGTACCCGT 



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1771 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






SSSSS ££££ ?£S2£ f 3 ™ 0 " 0 «*»»» 

(2) INFORMATION FOR SEQ ID NO: 12:






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

SScIS SSSS mzmT mum 

ATCGAAGACA CcSccSg CGgS^Sp 2? JTGCTCA ACC ^GGCGG ATTGCTGCGC 
GGCCGTCGGA TGCCGaSS gSg^S CGA ^GT GCTGCTCCCC 

ATCGAGAACT CTCGGGG^C SSSSS '^^CG CTGGCGCTCG GAGCACGGAC 
ACCTAGTTGT gSSSSS SSSSS SI"™ ^"TCMTC CACGCGCGCA 
OGCCCGAGTA GTGGGCcSg £SJ£S SL^^ AGTCCAC ^A TGGCCAAGTT 

s ™s -ss ssss 

™. ™. ---- s ~™ „™ 



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GACCATGACG 
CGGCGCGGTG 
CGGGTTCAAC 
AAGCATCCCC 
GCCCAGTGTC 
CATTCTGTCT 
GCCTCCCCTG 
ACCCTTCACG 
CGTCTCCGGG 
GGTGCTGGCG 
CGCTCTCAAC 
CGCCATTCAG 
GAACGCTCAA 
TGCGCAGAGC 
CGCCGACGAG 
CAATGACAAA 
GAACGCTGGA 
CGCGGACGCG 
CTTTCAGGAT 
GTGATGAAGG 



CCCCCTCCTG 
ACGATAGCGG 
CGGGCACCCG 
GCAGCAAACA 
GTCATGTTGG 
GCCGAGGGGC 
GGCAGTCCGC 
GTGGTGGGGG 
CTCACCCCGA 
ATCGGGTCGC 
CGTCCAGTGT 
ACCGACGCCG 
CTCGTCGGAG 
GGCTCGATCG 
TTGATCAGCA 
GACACCCCGG 
GTGCCGAAGG 



TTGGTTGCCG 
CCCTCGGGCG 
TCGCCGCGCA 



GGATGGTTCG 
TGGTGTCCGC 
CCGGCCCCAG 
TGCCGCCGGG 
AAACCGATCT 
TGATC7TGAC 
CGCCGAAAAC 
CTGACCCCAC 
TCTCCCTGGG 
CGCTCGGTTT 
CGACGACCGG 
CGATCAACCC 
TCAACTCGGC 
GTCTCGGTTT 
CCGGCAAGGC 
GCGCCAAGAT 
GCGTCGTTGT 
CCGTGCGGTC 
GTAGCCGCAC 
GTGTTCAAAG 



CCAACGCCCT 
CGGCATCGGC 
CGGCGGCCCA 
GTCGGTCGAA 
GGGCCGCCAG 
CAACAACCAC 
GACGGTAACC 
CAGTGATATC 
TTCCTCCTCG 
GGAGGGCACC 
CGAGGCCGGC 
CGGTAACTCC 
CATTGCCACG 
TGCGATTCCA 
GTCACATGCC 
CGTCGAAGTA 
CACCAAGGTC 
CAAAGCGCCG 
AGTGCAAGTC 
C 



CGTGCAGGCA 
GGCGCGGCCG 
GTGGCTGCCA 
CAGGTGGCGG 
TCGGAGGAGG 
GTGATCGCGG 
TTCTCTGACG 
GCCGTCGTCC 
GACCTGAGGG 
GTGACCACGG 
AACCAGAACA 
GGGGGCGCGC 
CTGGGCGCGG 
GTCGACCAGG 
TCCCTGGGTG 
GTGGCCGGTG 
GACGACCGCC 
GGCGCCACGG 
ACCCTCGGCA 



TGTTGGCCAT 
CATCCCTGGT 
GCGCGGCGCC 
CCAAGGTGGT 
GCTCCGGCAT 
CGGCCGCCAA 
GGCGGACCGC 
GTGTTCAGGG 
TCGGTCAGCC 
GGATCGTCAG 
CCGTGCTGGA 
TGGTGAACAT 
ACTCAGCCGA 
CCAAGCGCAT 
TGCAGGTGAC 
GTGCTGCCGC 
CGATCAACAG 
TGGCGCTAAC 
AGGCGGAGCA 



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:



CTCCACCGCG GTGGCGGCCG CTCTAGAACT AGTGGATCCC CCGGGCTGCA GGAATTCGGC 60
ACGAGGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120
AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180
CCGGCGACGG CGAGCGCCGG AATGGCGCGA GTGAGGAGGC GGGCAGTCAT GCCCAGCGTG 240
ATCCAATCAA CCTGCATTCG GCCTGCGGGC CCATTTGACA ATCGAGGTAG TGAGCGCAAA 300
TGAATGATGG AAAACGGGCG GTGACGTCCG CTGTTCTGGT GGTGCTAGGT GCCTGCCTGG 360
CGTTGTGGCT ATCAGGATGT TCTTCGCCGA AACCTGATGC CGAGGAACAG GGTGTTCCCG 420
TGAGCCCGAC GGCGTCCGAC CCCGCGCTCC TCGCCGAGAT CAGGCAGTCG CTTGATGCGA 480
CAAAAGGGTT GACCAGCGTG CACGTAGCGG TCCGAACAAC CGGGAAAGTC GACAGCTTGC 540
TGGGTATTAC CAGTGCCGAT GTCGACGTCC GGGCCAATCC GCTCGCGGCA AAGGGCGTAT 600
GCACCTACAA CGACGAGCAG GGTGTCCCGT TTCGGGTACA AGGCGACAAC ATCTCGGTGA 660
AACTGTTCGA CGACTGGAGC AATCTCGGCT CGATTTCTGA ACTGTCAACT TCACGCGTGC 720
TCGATCCTGC CGCTGGGGTG ACGCAGCTGC TGTCCGGTGT CACGAACCTC CAAGCGCAAG 780
GTACCGAAGT GATAGACGGA ATTTCGACCA CCAAAATCAC CGGGACCATC CCCGCGAGCT 840
CTGTCAAGAT GCTTGATCCT GGCGCCAAGA GTGCAAGGCC GGCGACCGTG TGGATTGCCC 900
AGGACGGCTC GCACCACCTC GTCCGAGCGA GCATCGACCT CGGATCCGGG TCGATTCAGC 960
TCACGCAGTC GAAATGGAAC GAACCCGTCA ACGTCGACTA GGCCGAAGTT GCGTCGACGC 1020
GTTGNTCGAA ACGCCCTTGT GAACGGTGTC AACGGNAC 1058



(2) INFORMATION FOR SEQ ID NO: 15:






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:



ScS2 rSSf TGA TCGACATCAT cgggaccagc CCCACATCCT gggaacaggc 
gcggtccagc gggcgcggga tagcgtcgat gacatccgcg tcgctoggS 

gcgcggccca screed tcSSS £2S 

cSSS SSSS CGACCCGAAC SS?SSS 

SgScSa SSSf CCCGAGTTGC CCGAGGAAAC GTGCTGCCAG GCCGGTAGGA 
AGCGTCCGTA GGCGGCGGTG CTGACCGGCT CTGCCTGCGC CCTCAGTGCG GCCaIcgSc" 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 913 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY : linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

CaIStScC TGCCGCCGTC GCCGATCAGC TGCGCATCGC 

CcSSrS ^S^ GC CGGTGGGG CC GGGGCCGCCG ATGCCACCGC 
GCcJtCGcS CcSc^S SSS*** ATACAGCACC CCGCCGGGGG CACCGTTACC 
CAAGCcSc GCcScACcI ^S'c— «ACCOCCOC 
CCGAACAGC" AMrr*rr«T^ ZI^ C=TT TTCCGCCCGC CCCGCCGGCG CCGCCAATTG 
GCC^SSc C^S™ "?™ GCC CCGCCGCC ^ TAACGGCGCT GCCGGGCGCC 
GTTTGCC3CC £££££ OTCGGTGG ~ CCGCCGTTAC CGGCGCCGCC 

CACCGAAACA AcSS^Ar ^S^^ 2 AGACCCG CCG GGGCCACCAT TGCCGCCGGG 
TCACCGC^S S^r™ GGTGC ^CC3 GCCCCGCCGT TTGCCGCCAT CACCGGCCAT 
cSS^S G SSrS; AMCTraTO AACCCGGTAC CGCCAGCGCG GCCCCTATTG 
CGG^CCGCC f GCCGGCGC CGCCAACG ^ CAAAAGCCCG GGGTTGCCAC 
™™ G TCCCGCCGA TCCCCCCGTT GCCGCCGGTG CCGCCGCCAT 

CGGCGCCGCC ScSS^ CGCSGGTTCC ° GCGereBCG ^ GGG ^ 

-GCC-C-G— rrv^™ 0 . AGC - ACCCCC CGGTGGCGCC GTTGCCGCCA TTGCCGCCAT 
CGCCGGC^C CGr «*«*TTCC CGCCGCCACC GCCGGNTTGG CCGCCGgS 






(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1872 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
GACTACGTTG GTGTAGAAAA ^rrT^^^.- 

TAGCTACCCC gJSSJJJ TO SSSf^ 77 CAATTTCTGA 

^ACACAGGAG CAC^AT GAGCAATTCG CGCCGCCGCT CACTCAGGTG x 



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™ iii «~ SS5= SSS5S s 

™£««««T« CCC AAGTGGCGCC ACAGGTGGTC AACATCAACA CCAAACTGflr 

==1=11 
====== 

S SSSJS ss * 

?£S2 SSS S£ o1= -~ SSS 

£££££ SS 8 ™* 

C1GCCOTGAT tSSb^i ^™ ^CCSCCGG CCGGCCAATT GGA1TGGCGC 

™ EE Eil EiF ™ ssss 

isS^i £££S So™ ™£ ssss 

S^cS ™' KC3a GGTATCCGCA TCATCACC3A CCTGG,££ 

TaSSS ££52£ SSSIT CMaOICK """ra AGACGGACCG 
T-CCT-oIS "^GACACC AGCOAGCGCT ACACCGACGC CCGGATCATC 

gcaccgattc ttcgatcctg tccgccgaca gi^ctaSg 



(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1482 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

SPSSS ^ GAACAGG ^GTTCCCGT GAGCCCGACG GCGTCCGACC 

SSSS SSSS SS?. C SS^ ACCAGCGTGC 

Trr , r - m „_„ GGAAAGTCo ACAGCTTGCT GGGTATTACC AGTGCCGATG 

g?g?cccg£ SSSSS CTCGCGGCAA ^cgtatg cacctacaac gacgagSS 
Sc-cSS? 8GCGAcm tctcggtgaa actgttcgac gactggagca 

cSgcIg? £££££ ^ TCAACTT ^gcgtgct cgatcctgcc gctggggtga 
SSgaSc £2S££ ?^f CCTCC ^^g taccgaagtg atagacggaa 

ifllXr CAAAATCACC GGGACCATCC. CCGCGAGCTC TGTCAAGATG CTTGATCCTG 

iii lii ~s sss ssss sss 
s= iii iil ™ — ssss 

CGGTCTTTGA GCCGGTAGCT gScSS SSSS ZSS*""* 

4UG AGoGv^GACGA CTTCAGCATG GTGGACGAGG 



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S EE ~ sss s= sss 
=5 EE 5E = ~ sss 
s=s sssss ™ EE =£ ™ 

TTTGAGGCCA rrarraT^ ™V*^^ CAGGGTGCCC AGATGTGCGA TGGTGTCGCG 

£SS£ SSaSS SSSSi 0ACTTCC(aA CCGGGAAGCG 

GCAGGCGGCC AGGTATTCTT ^ GCTGACACTT CCCGCTGCAG 

GGACACTGAC TCACGcIS t'SaIcS SSSSS gT"^ " 4 ° 



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 876 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: 

SSSS £SS£ Z^tr* gaccagatgg ctcgaggg ^ 

CGGTCaSS SgCtS SSSJ ^ GACC&GT GAGGGC ^ AACACGCCGA 
CGCCTACGAA MCGO~5S CCCGGCGGTG GTTGCCTACG ACCCGGCCTT 

GgSScSc" "ttS™ JSE?*** 6 CGGACTGGCC AGGATGTGCG GGGAGAACCC 
ggaSa£?c SSSS"* CMCGMC » TACGTGCAGC CGCCGGAGCC 

GCAACGoS TrllrZ^ CWTATCACQ CGGCCACCGA 

GGCAgSSg £g^Sc~ S™_f C CTCCGGGGTA ^GATGCCCG CGGCGCTGCG 
TTGGGGCGAG £a5£™ TGTCGCCGC ~ GAGGTGTGGT CGGTGACCAG 

-CGC^rrrr ^™"" ,_J ACGGGGTGGT CATCGAGACC GAGAAGCTCC GCCACCCGA 

cgSt?c G acgtgacgag agcgctggag aatgctcggg gcccggt^a? 

2S££ ^SgSS GCGCGG7c ~ -agcagatc cgaccgtggg tgccgggSJ 

^tSS ZTcl^rrr !RTTCBa « ACTCGGCCCG CCGGTCGTCG 

SgAcSJg £S£SS SSSST WBCaCBOr "TOOCafiQG GTTGGCCGGG 
ATTCGACGAA ££S£ SS££ ST ™ 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
SESSS SSSSJ GAGACAAAAT TCCACGCGTT AATGCAGGAA 

SSSSS SSJSSE SSSSJ ™ ATGTCG CGATCGCGGT 

CATGCAATGA TGCtSgcI SSSS SSSS SSSSS SESE 

SSSS SSSS SSSS SSSES ™- -™ 

gatttcctcg gcgagcagtt cSS£5 ££££ ™S SSSS 



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=ES S= 5= j~ SS SSSSS 

ESSS ~ ~ ™ ™ c ssss 

AACAACGTCG cSSSS GGCCGCACTG CTGGTCAGGT ATCGGGGGGT CTTGGCGAGC 
SS^Scc SSSn ^f AGCCCGC CGGAT ^GCA GACCGGGGGG GCGAAAACGA 
SgS^SS S^SSS GGGGGGTGCG GGAATACCGA ACCGGTGTAG 

GAGCGCCAGC AGTTGTTTTT CCACCAGCGA AGCGTTTTCG GGTCATCGGN GGCNnSS 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

ACCGC^S J£5^ AAGATGGTGA AATCGATCGC CGCAGGTCTG 

CCgScS GCCGGTGTGA CTTCGATCAT GGCTGGcSn 

TCCGCCCCTG SSS^S SSf™ CACTGCCGTT GGACCCGGNA 

ccc^cgS? S£SS° TGGACCAGNC tgctcaacag NCTCGNCGAT 

GGNGNGnSc SSSJ f**^ CTGGTCGAGG GNGGNATCGG NGGNANCGAG 



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
cS^GAcS GcSS?" ^mTT GGGNGCGGGT GGTTAACCCG CTCGGCCAGC 

cctSSS SSSS 222°™ CGCGCTGGAG CTCCAGGCGC 

ACGCGATGAC CCcStcS^ ™^ AGCCGTTGNA GACCGGGATC AAGGCGATTG 
GCAAAAACCG SJSSS? ^^^^ CGCAAGACCG 

GGTGGATCCC AAGaSSS S2S^ T C ^CAAACCA GCGGGAAGAA CTGGGAGTCC 

cttaccaSg ^ GCAGG tgc gcttgtg TATACGTTGG ccatcgggca AGAAGGGGAA 



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 352 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

TG^AcS SSS™ OWTCCOCT GGCGGTGGTG GATCAGCAAC 

T^CGGC SSSS?* ™CTCAGGC AGCC3CTGTG CCGGTGG^t 

5££25£ £££££ F 500 ^ TAGCCGAGAT CAAGGCGGGC GAATCGGTGC 

g^gSggx xScg^caIc gSSSI ? CAGCTGGCT CGCCAGTGGG 

TTGACGACGA NCCATATcS SSS K£5 Jf™ "* 

(2) INFORMATION FOR SEQ ID NO: 24: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 726 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
™o TCGACCAGCS SCTGGCGATA ATCGACGAAG TGATCAAGCC 

SSSS ™S SSSSJ AGTTCTCTGG TA ™ 

GCCGCACGCT CATGCTGGCr rSJSS^I CGTACCGTCA TCGCATGTAC CGGTTCGCGT 
GCGCGCAGTC SSgCcIS TGGCCACGGG TGTGGCGGGT CTCGGGGTCG 

CTTn- GAC-r rvv™^ ACCGCGCCGG TGCCCGACTA CTACTGGTGC CCGGGGCAGC 
gSaCAGCGA ££S£5 ATCCCTACAC CTGCCATGAC GAclrlScc 

TGCTTGACGA ScrJ^ CACAGCCGCG ACTACCCCGG ACCCATCCTC GAAGGTCCCG 
CgS2ccG A GC~™f GC ^CGCCGC CCCCGGCTGC CGGTGGCGGC GCATAGCGC- 
SSJPS £££S5° CGAATACGCG TATAAACCCG GGCGTGCCCC CGGCAAGCTA 
A^TAGgS S^rl GTGCCGATGG ATCGCGCCGT CCGATGaSg 

SctlS- S^Sf CAACC3CTTG GAGGACGCTT GAAGGGAACC TGTCATGAAC 
aIcgtg CC - CCACCAT ^catcgac AAGGTTGTTA CCCGCACACC CGTTCGCcIS 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 530 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:



SSc^C S5 SSSS^ CGCCTATGCG rTOATGCAGO CGACCGGGAT 
CTGCCCGATG GCsiS^ ^T mCCT aXa ™^ CGACCTTTTG ACCAGCCGGG 
aSSmS SjISS^ SJSS™ CGCCGa » ! " TGTGCACCTG ATCAACCCGA 

esse ™1 i~ sss sss ssss 






(2) INFORMATION FOR SEQ ID NO: 26:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

AACGGAGGCG CCGGGGGTTT TGGCGGGGCC GGGGCGGTCG GCGGCAACGG CGGGGCCGGC 
GGTACCGCCG GGTTGTTCGG TGTCGGCGGG GCCGGTGGGG CCGGAGGCAA CGGCATCGCC 
GGTGTCACGG GTACGTCGGC CAGCACACCG GGTGGATCCG 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

GACACCGATA CGATGGTGAT GTACGCCAAC GTTGTCGACA CGCTCGAGGC GTTCACGATC 60 
CAGCGCACAC CCGACGGCGT GACCATCGGC GATGCGGCCC CGTTCGCGGA GGCGGCTGCC 120 
AAGGCGATGG GAATCGACAA GCTGCGGGTA ATTCATACCG GAATGGACCC CGTCGTCGCT 180
GAACGCGAAC AGTGGGACGA CGGCAACAAC ACGTTGGCGT TGGCGCCCGG TGTCGTTGTC 240
GCCTACGAGC GCAACGTACA GACCAACGCC CG 272



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 317 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

^^f TG ^CTCGGAC TATCTGCGCA CGGTGACGCA GCGCGACGTG CGCGAGCTGA SO 

GCAGACGGAT CGCCTGCCGC GGTTCATGCG CTACCTGGCC GCTATCACCG 120 

CGCAGGAGCT GAACGTGGCC GAAGCGGCGC GGGTCATCGG GGTCGACGCG GGGACGATCC 180

GTTCGGATCT GGCGTGGTTC GAGACGGTCT ATCTGGTACA TCGCCTGCCC GCCTGGTCGC 240 

^IZT C CGCGAAGATC AAGAAGCGGT CAAAGATCCA CGTCGTCGAC AGTGGCTTCG 300 
CGGCCTGGTT GCGCGGG 

317 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 182 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:



SScSS f^^ 0 ^ TGAGCAATCA CACCTACCGA GTGATCGAGA 

S5S££ ?r~r,r.?r.t Q GGCf ^GKia CGGCAATCCA GGGCGGTCTG GCCCGAGCTG 

GACT0CTTCG ^TACAGTC AATTCGAGGC CACCTGGTCG 
ACGGAGCoG i CCGCACTTC CAGGTGACTA TGAAAGTCGG CTTCCGCTGG AGGATTCCTG 
£££££ Scr^ MCWM « e CATCATTAAG CGACTtScI £££££ 

GACGCGC.C- AAACGCGGTT cagccgacgg tggctccgcc gaggcgctgc ctccaaaatc 
cctgcgacaa ttcgtcggcg gcgcctacaa ggaagtcggt gctgaattcg 






GATCGTGGAG CTGTCGATGA ACAGCGTTGC CGGACGCGCG GCGGCCAGCA CGTCGGTGTA 60
GCAGCGCCGG ACCACCTCGC CGGTGGGCAG CATGGTGATG ACCACGTCGG CCTC^S 120 
CG CTTCGGGC GCGCTACGAA ACACCGCGAC ACCGTGCGCG GCGGCGCCGG Sc^St Ho 

182 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 308 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

AGGTGGTCGA ^CGAAAGTC TGGGCGCCTG CGAAGCGGGT 
C^GC.TTCAC oAG^wGAAGA CACGCCTGTC CGAGCTGCTG CGGCTCGTCT ACGGCGGGCA 
GAGGTTGAGA TTGCCCGCCG CGGCGAGCCG GTAGCAAAGC TTGTGCCgS £££££ 
GAGACTCGGC GGTTAGGCAT TGACCATGGC GTGTACCGCG TGCCCGAcS SS^SJ 
SSST ACGACGTGCT CGAAC — CACCGGTGAA GCGCTACCTC IS^CCC 300 

308 

(2) INFORMATION FOR SEQ ID NO: 31: 
(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 267 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GCAACTCACG TGGATGATGG TCGGCAGCGG CATTGAGGAC GGAGAGAATC 
GgSc^C SSSS 0 ^GTGACCGG CCGTAGAGGG CTCCcSS 

SSSg ^SSSS tgccgctggc cggtaagagc GGGTAAAAGA ATGTGAGGGG 

2S£S£ SSc STCi * 0 """ ! " °— «*» «°««» »• 

* 267 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1539 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
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(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 254 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34 
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SSSSS I!!^" GC ^CCGGACGA AGCGGTGCTC GACGTCGGCT GCGGCTCGGG 
UlSrZ? CCGGCTATCT GGACGCTACG CCGGCTTCGA 

cSSSr CGTGGTGCCA GGAGCACATC ACCTCGGCGC ACCCCAACTT 

CCAGTTCGAG GTCTCCGACA TCTACAACTC GCTGTACAAC CCGAAAGGGA AATACCAGTC 

gScSSc' SSSf ATCCGGATGC GTCGTrCGAT GT ^^ ™^ 

GTTCACCCAC ATGTTTCCGC CGGACGTGGA GCACTATCTG GACGAGATCT CCCGCGTGCT 
SSSSSS °« C « roCC ^TGCACGTA CTTCTTGCTC AATGACGaS CgSSSS 
CATCGCGGAA GGAAAGAGTG CGCACAACTT CCAGCATGAG GGACCGGGTT ATCGGACAAT 
CCACAAGAAG CGGCCCGAAG AAGCAATCGG CTTGCCGGAG ACCTTCGTCA GGGATGTCTA 

IcI^cSa TGCACGAACC ATTGCACTAC 

cSSSt AG fT CCAGG ACATCGTCAT CGCGACCAAA ACCGCGAGCT AGGTCGGCAT 
CCGGGAAGCA TCGCGACACC GTGGCGCCGA GCGCCGCTGC CGGCAGGCCG ATTAGGCGGG 
CAGATTAGCC CGCCGCGGCT CCCGGCTCCG AGTACGGCG C CCCGAATGGC GtSSgS 

SSJSSS SSS^ TGGGCGGCGG CCTGCCGGAT CAGGTGGTAG SScSS 
AGCCTGCGTG ATCGGTCATC ACCAACGGTG ACAGCAGCCG GTTGTGCACC AGCGCGAACG 
r^rrl^Tr CTCCGGGTCT GTCCAGCCGA TCGAGCCGCC CAAGCcSS SSaIcC 
GTTGCCGATC «™»GT GATAGCCAAG ATGAAAATTT AAGGGCACCA 

gSa^ SSZ*™* AC " GC = GTC GGTTGCGGGT CAGGCCCGTG ACCAgSSc" 
GCGACAAGAA CCGTATGCCG TCGATCTCGC CTCGTGCCG 

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 851 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

^Ir^^Z GCGTGGATGA G.CGTCACCGC GGGGCAGGCC GAGCTGACCG CCGCCCAGGT 
SSS 80 " ACGAGACGGC ^ATGGGCTG ACGGTGCCCC CGCCGGTGAT 
SS^r CG Z GCTGAAC -SATGATTCT GATAGCGACC AACCTCTTGG GGCAAAACAC 
SSSSJ^ GCGGTCAACG AGGCCGAATA CGGC3AGATG TGGGCCCAAG ACGCCGCCGC 

gJcgcISg SSS"* cgacggcgac ^cgacggcg ACGTTGCTGC CGTTCGAGGA 

CGGGTGGGCT CCTCGAGCAG GCCGCCGCGG TCGAGGAGGC 
GCCGCSGCGA ACCAGTTGAT GAACAATGTG CCCCAGGCGC TGAAACAGTT 
ScSgS? !££Z£* CCACSCC ^ C TTCCAAGCTG GGTGGCCTGT GGAAGACGGT 
CTCGCCGCAT C3GTCGCCGA TCAGCAACAT GGTGTCGATG GCCAACAACC ACATGTCGAT 
f^S^™ TGACCAACAC CTTGAGCTCG ATGTTGAAGG gSt^C 

SSSSf* GCCCAGGCCG TG ^CGC GGCGCAAAAC GGGGTCCGGG CGATGaSc 
GCTGGGCAGC TCGCTGGGTT CTTCGGGTCT gggcggtggg gtggccgcca actwgqt? 
ggcggcctcg gtacggtatg gtcaccggga tggcggaaaa tatgcanaS c^StISg 

CCGGCGTAAG GTTTACCCCC GTTTTCTGGA TGCGGTGAAC SSSaIg 840 
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GATCGATCGG GCGGAAATTT GGACCAGATT CGCCTCCGGC GATAACCCAA TCAATCGAAC 
CTAGATTTAT TCCGTCCAGG GGCCCGAGTA ATGGCTCGCA GGAGAGg'aIc" C^S^l " 
CGGGCACCTG TCGTAGGTCC TCGATACGGC GGAAGGCGTC GACATTT^CC ^ 

g c ss^ ^ c cgagggc ™- — — gcaS sss^ a 1 :: 

254 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1227 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 

cS~Scg SSSJS FZS™** GAAGTCGC ^ TTGGACCAGG AGGGACGGGA 
Sc^SS CGGTrCAGCC GGQGQGGTGC GCTGGATTGC GCTATAACCT 

TTTCTTCGAC GACCGGACGC TGGATGGTGA CCAAACCGCG GAGTTCGGTG GTGTCAGG-r 
GATCGTGGAC CGGATGAGCG CGCCGTATGT GGAAGGCGCG TCGATCGaS SSSSS 

GATTCGTTCA SSSSS £5**" ^SSS 
S^ScS SSSJSJ ff™ 1 * 00 ACCCCGCGGT GCGCAACACG TACGAGCACA 
GG^GCCgS AAMGC * ACT ^CGATGCC TTGCACCTGA CCGCGTGGCG 

SSf^ GGCAGGTGTC ACCTGCATGG TGAACAGCAC CTGGGCCTGA TATTGCGAC" 
cS™G Sr™ TC « flTOC " CGACCTGGGA GAACTGCtS cSScIt 
CGC-GCTCAG CTTGGCCAAG GCCTGATCGG AGCGCTTGTC GCGCACGCCG TCGTGGATAr 

rr^^ ATTGCSAACG ATGGTGTCCA CATCGCGGTT CTcSScG 
'gC^cS «^CCTCCG AGAATGTGCC TGCCGTGTTG SS5£ 

f^i^ GTATATGATC GCCGCCGTCA TAGCCGACAC CAGCGCGAGG GCTACCACAA 7 80 
SSSS? TGCCGTCGCT ^GGTAGGA CACCTGCGGC GGCACGCCGG 
^^S GC GCCG CGTCGT CTGCCGGTCC CGGGGCGAAG GCCGGTTCGG 
™c^^G ~ A GTCCAGGG CTTGGGGTTC GTGGGATGAG GGCTCGGGGT 

—C-AgI gS^gS C : 3ACACCGG ^CGGCGA GTGGGGACCG GGCATTGTGG 

C.AGG GTGGTGGACG GGACCAGCTG CTAGGGCGAC AACCGCCCGT CGCGTWr 

GGCAGCATCG GCAATCAGGT GAGCTCCCTA GGCAGGCTAG ScaS GCCG^S 

S^gga" 5^™° CSCGGCGCCG ATAATG — " g ^ ggg 

AC^AAGGACG GAGATTTTGT GACGATC 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 



gScSS gggmct ™ acggcaacgg CGGGGCCGGC GGGGCCGGCG 

gSgnSS ^GGCGGCA ACGCCTGGTT GTTCGGGGCC GGCGGGTCCG 

GCGGNGC.GG CACCAATGGT GGNGTCGGCG GGTCCGGCGG ATTTGTCTAC GGCAACGGCG 

(2) INFORMATION FOR SEQ ID NO: 37: 
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(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 290 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

SSS? "™ «««». CGGTGTCGSC SCCCGGGCCC 

CKSGcSlC SSSS >G<!0C0GC = T CCGTQGGCG GGCGGCAATC 

CCCaS £££££ S£S S^?^ GG ° CS;ra!C ==MO«»CG 
OCCTC^ ££££ SS ATTCGCGGCG 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
GATCCAGTGG CATGGNGGGT GTCAGTGGAA GCAT 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
?ScGt5£ CGCCACCGGT CCCACCGTTA CCGAACAAGC 

£S£S S£S£ SSSS S£°~ ™ ma,a «: 

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: 
ATGGCGTTCA CGGGGCGCCG GGGACCGGGC AGCCCGGNGG GGCCGGGGGG TGG 
(2) INFORMATION FOR SEQ ID NO: 41: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

GATCCACCGC GGGTGCAGAC GGTGCCCGCG GCGCCACCCC GACCAGCGGC GGCAACGGCG 
GCACCGGCGG CAACGGCGCG AACGCCACCG TCGTCGGNGG GGCCGGCGGG GCCGGCGGCA 
AGGGCGGCAA CG 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

GATCGGCGGC CGGNACGGNC GGGGACGGCG GCAAGGGCGG NAACGGGGGC GCCGNAGCCA 
CCNGCCAAGA ATCCTCCGNG TCCNCCAATG GCGCGAATGG CGGACAGGGC GGCAACGGCG 
GCANCGGCGG CA 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 702 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60
CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120
ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180
AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240
AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300
CCATCACACC GTGCGAACTC ACGGNGGNTA AAAACGCCGC CCAACAGNTG GTNTTGTCCG 360
CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420
CGCTGCGCAA CGCGGCCAAG GNGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480
ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540
CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600
TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTGNG 660
GGGATGGGTG GAACACTTNC ACCCTGACGC TGCAAGGCGA CG 702

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 298 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single 






WO 99/42076 



PCT/US99/03268 






(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GAAGCCGCAG CGCTGTCGGG CGACGTGGCG GTCAAAGCGG CATCGCTCGG TGGCGGTGGA 60
GGCGGCGGGG TGCCGTCGGC GCCGTTGGGA TCCGCGATCG GGGGCGCCGA ATCGGTGCGG 120
CCCGCTGGCG CTGGTGACAT TGCCGGCTTA GGCCAGGGAA GGGCCGGCGG CGGCGCCGCG 180
CTGGGCGGCG GTGGCATGGG AATGCCGATG GGTGCCGCGC ATCAGGGACA AGGGGGCGCC 240
AAGTCCAAGG GTTCTCAGCA GGAAGACGAG GCGCTCTACA CCGAGGATCC TCGTGCCG 298

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

CGGCACGAGG ATCGAATCGC GTCGCCGGGA GCACAGCGTC GCACTGCACC AGTGGAGGAG 60
CCATGACCTA CTCGCCGGGT AACCCCGGAT ACCCGCAAGC GCAGCCCGCA GGCTCCTACG 120
GAGGCGTCAC ACCCTCGTTC GCCCACGCCG ATGAGGGTGC GAGCAAGCTA CCGATGTACC 180
TGAACATCGC GGTGGCAGTG CTCGGTCTGG CTGCGTACTT CGCCAGCTTC GGCCCAATGT 240
TCACCCTCAG TACCGAACTC GGGGGGGGTG ATGGCGCAGT GTCCGGTGAC ACTGGGCTGC 300
CGGTCGGGGT GGCTCTGCTG GCTGCGCTGC TTGCCGGGGT GGTTCTGGTG CCTAAGGCCA 360
AGAGCCATGT GACGGTAGTT GCGGTGCTCG GGGTACTCGG CGTATTTCTG ATGGTCTCGG 420
CGACGTTTAA CAAGCCCAGC GCCTATTCGA CCGGTTGGGC ATTGTGGGTT GTGTTGGCTT 480
TCATCGTGTT CCAGGCGGTT GCGGCAGTCC TGGCGCTCTT GGTGGAGACC GGCGCTATCA 540
CCGCGCCGGC GCCGCGGCCC AAGTTCGACC CGTATGGACA GTACGGGCGG TACGGGCAGT 600
ACGGGCAGTA CGGGGTGCAG CCGGGTGGGT ACTACGGTCA GCAGGGTGCT CAGCAGGCCG 660
CGGGACTGCA GTCGCCCGGC CCGCAGCAGT CTCCGCAGCC TCCCGGATAT GGGTCGCAGT 720
ACGGCGGCTA TTCGTCCAGT CCGAGCCAAT CGGGCAGTGG ATACACTGCT CAGCCCCCGG 780
CCCAGCCGCC GGCGCAGTCC GGGTCGCAAC AATCGCACCA GGGCCCATCC ACGCCACCTA 840
CCGGCTTTCC GAGCTTCAGC CCACCACCAC CGGTCAGTGC CGGGACGGGG TCGCAGGCTG 900
GTTCGGCTCC AGTCAACTAT TCAAACCCCA GCGGGGGCGA GCAGTCGTCG TCCCCCGGGG 960
GGGCGCCGGT CTAACCGGGC GTTCCCGCGT CCGGTCGCGC GTGTGCGCGA AGAGTGAACA 1020
GGGTGTCAGC AAGCGCGGAC GATCCTCGTG CCGAATTC 1058

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: 



CGGCACGAGA GACCGATGCC GCTACCCTCG CGCAGGAGGC AGGTAATTTC GAGCGGATCT 60
CCGGCGACCT GAAAACCCAG ATCGACCAGG TGGAGTCGAC GGCAGGTTCG TTGCAGGGCC 120
AGTGGCGCGG CGCGGCGGGG ACGGCCGCCC AGGCCGCGGT GGTGCGCTTC CAAGAAGCAG 180
CCAATAAGCA GAAGCAGGAA CTCGACGAGA TCTCGACGAA TATTCGTCAG GCCGGCGTCC 240
AATACTCGAG GGCCGACGAG GAGCAGCAGC AGGCGCTGTC CTCGCAAATG GGCTTCTGAC 300
CCGCTAATAC GAAAAGAAAC GGAGCAA 327

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

GATGGCGTTG ^GAACGTGA CCGATTCTGT ACCGCCGTCG TTGAGATCAA 60 
SSSJ^ GTTGGCGTCG GCAAATGTGC CGNACCCGTG GATCTCGGTG ATCTTgSS 120 
TCTTCATCAG GAAGTGCACA CCGGCCACCC TGCCCTCGGN TACCTTTCGG 170

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 

SSSS SSSS?? GCCGGCGGCA GCACCGCTGG CGCTGGCGGC AACGGCGGGG 
gSS? cggcggaacc ^TGGGTTGC TCTTCGGCAA CGGCGGTGCC GGCGGGCACG 120 

127 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

SSSS G CCGGCAACGG GAGCGGCGCG GCCGGCGG ^ ACGGCGGCAA S0 

81 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 149 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:



GATCAGGGCT GGCCGGCTCC GGCCAGAAGG GCGGTAACGG AGGAGCTGCC GGATTGTTTG 60 
GCAACGGCGG GGCCGGNGGT GCCGGCGCGT CCAACCAAGC CGGTAACGGC gSgSSS JS 
GAAACGGTGG TGCCGGTGGG CTGATCTGG ^i^^c ^NGCCGGCG 12 0 
(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

TCGAAGTaS S^SSS t ^ f***^ ««««« CTGGACTGGT 

CTATGAAAGT SSScS SSSS 1^^°° ^GCGCAC TTCCAGGTGA 
GGTGCATCAT TAaSSS ScSSS SJ^"* «*«^ CGATAACTGA 
ACGGTGGCTC CGCcSS Sg^^S LSS^ GGTTCAGCCG 

CTGC^TCCAA AATCCCTGCG ACAATTCGTC GGCGG 

(2) INFORMATION FOR SEQ ID NO: 52: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 999 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 332 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:

A 5SSS SESS SS^z 0IB " 0TO ACTTGACACG ™*»«* 

CCCGCGACCG cSS££I ^c.gcgatg gccagcgcca gcctggtgac cgttgcggtg 

ccgcScS SJSSS TACCCACAAC 

GCCGCCGCCA ACACR-r-i* ~^^r CCGvjCGACAC CTGTTGCCCC CCCACCACCG 

gacccSS J£S££ SSSSS S CGATCCCA acgcagcacc tccgccggcc 

GACAACCCGG TTgSgSS SgC^SS CRCCCCM « TGTCCGGATC 

GCCCACTTCG AC-AC^TTP CTGCCTGCTG GCTGGGTGGA GTCTGACGCC 

ggacagSg? cSS?Sc A ^I CCTC agcaaaacca ccgggqaccc GCCATTTCCC 

rJ^r-Z: ^° GTGGC ^AATGACACC CGTATCGTGC TCGGCCGGCT AGACCAAAAG 

ggtgLSt £££££ SSSS CCCGOTGGG ^S^G 

GCCAACGGGG SSSSg AGGAAACCGT CTCGCTCGAC 

CCGAACGGCC AGAtSSaI SSSSJ SS^*™ AGTTCAGCGA TCCGAGTAAG 
GGGCCCCCTC AGC^™ ~I ° GGCTCGCCCS CGGCGAACGC ACCGGACGCC 

ggcgcSS SSSc SS? Sg™ 05 CCAACAACCC 

GCACCGGCTC CTGCAGAGrr CGGCCTTTGG TCGCCCCGCC GCCGGCGCCG 

ccgacgaSc ssss ssss SSS 8 ccggggaagt cgctcctacc in 

(2) INFORMATION FOR SEQ ID NO: 53: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
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Met His His His His His His „ et His Gl„ Val Asp p ro ^ Leu Thr 
Ar 5 Arg Lys „ y Arg Leu Ma Ma £ ^ ^ ^ ^ ^ ^ 

Ala Ser ,eu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ma Asp Pro 
Glu Pro Ala Pro Pro Val p ro H Thr ^ ^ Ser £ ^ ^ ^ 

Ala Ala Ala Pro Pro Ala Pro Ala Thr ^ val "a Pro Pro Pro Pro 
Ala Ala Ala Asn Thr Pro ^ ^ Gla prQ £ y ^ ^ ^ ^ « 
«o P„ Pro Ala Aap p„ to Ul p „ , ro pro ^ ue £ 
As- Ala Pro sin P ro val ^ ^ % ^ ^ ^ 110 ^ ^ 
p he m. l.» ,„ 01y ^ "I Glu ser ^ ^ "S ^ ^ ^ 
J»x «y S„ Ma u. ser Lys lat Ihr 01y £ fro pM ^ ^ 
Gly cm to p™ pro v,! ai. „ ThI ™ n , Vil Uu ^ «o 

«. A=p Gl« Jj. Leu ^ Ua s „ HI ^ ^ ^ ^ «. ^ 

Ala Al. at, z^u G1 y ser ^ M „ ™ 01n me ^ 1» ^ ^ 

Gly Thx Ax 3 U. Asn Gin «u Z val s.r L.u A,p 111 As. Gly Val 
S-r Gly S« Ala 3a, Ty. £ „„ Val Lys p „. » Mp ^ ^ ^ 
Asn Gly an £ „ TSr Gly Val n . ~ _ ^ ^ ^ «• 
Al. P» Asp Al, » y Pro Pro G1 „ ^ g ™ ^ ^ ^ 2SS ^ 

Th, A!a Aa» Asn P„ val Aap , y , I" u , u . Lys M , £J ^ ^ 
S.r II. Arg P„ L . u Vll '» pro ^ Ua ^ »• ^ ^ ^ 

Ala Glu Pro Ala Pro Al* 3^ n , 300 

305 ?r ° Ala ?ro G1 y Val Ala Pro Thr 

Pro Thr Thr Pro Thr Pro Gla ^ g Thr Leu ^ JQa 

T5C 



320 



325 330 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Xaa Asn Tyr Gly Gln Val
1 5 10 15
Val Ala Ala Leu
20

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 

10 15 

(2) INFORMATION FOR SEQ ID NO: 56: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys
1 5 10 15
Glu Gly Arg

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro

10 15 

(2) INFORMATION FOR SEQ ID NO: 58: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Ala Ala Ala Ala Pro Pro
1 5 10 15
Ala

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 62:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Asp Pro Ala Ser Ala ?ro Asp Val Pro Tnr Ma ^ Qln Gln ^ ^ 

10 15 
Leu Leu Asn Asn Leu Ala Asp Pro Asp Val Ser Phe Ala Asp
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 187 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Thr Gly Ser Leu Asn Gln Thr His Asn Arg Arg Ala Asn Glu Arg Lys
1 5 10 15

Asn Thr Thr Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala
20 25 30

Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala
35 40 45

Gly Gly Pro Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro
50 55 60

Leu Pro Leu Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln
65 70 75 80

Leu Thr Ser Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala
85 90 95

Asn Lys Gly Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg
100 105 110

Ile Ala Asp His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro
115 120 125

Leu Ser Phe Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala
130 135 140

Thr Ala Asp Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr
145 150 155 160

Gln Asn Val Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala
165 170 175

Ser Ala Met Glu Leu Leu Gln Ala Ala Gly Xaa
180 185

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 148 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Asp Glu Val Thr Val Glu Thr Thr Ser Val Phe Arg Ala Asp Phe Leu
1 5 10 15

Ser Glu Leu Asp Ala Pro Ala Gln Ala Gly Thr Glu Ser Ala Val Ser
20 25 30

Gly Val Glu Gly Leu Pro Pro Gly Ser Ala Leu Leu Val Val Lys Arg
35 40 45

Gly Pro Asn Ala Gly Ser Arg Phe Leu Leu Asp Gln Ala Ile Thr Ser
50 55 60

Ala Gly Arg His Pro Asp Ser Asp Ile Phe Leu Asp Asp Val Thr Val
65 70 75 80

Ser Arg Arg His Ala Glu Phe Arg Leu Glu Asn Asn Glu Phe Asn Val
85 90 95

Val Asp Val Gly Ser Leu Asn Gly Thr Tyr Val Asn Arg Glu Pro Val
100 105 110

Asp Ser Ala Val Leu Ala Asn Gly Asp Glu Val Gln Ile Gly Lys Leu
115 120 125

Arg Leu Val Phe Leu Thr Gly Pro Lys Gln Gly Glu Asp Asp Gly Ser
130 135 140

Thr Gly Gly Pro
145



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 230 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65: 



Thr Ser Asn Arg Pro Ala Arg Arg Gly Arg Arg Ala Pro Arg Asp Thr
1 5 10 15

Gly Pro Asp Arg Ser Ala Ser Leu Ser Leu Val Arg His Arg Arg Gln
20 25 30

Gln Arg Asp Ala Leu Cys Leu Ser Ser Thr Gln Ile Ser Arg Gln Ser
35 40 45

Asn Leu Pro Pro Ala Ala Gly Gly Ala Ala Asn Tyr Ser Arg Arg Asn
50 55 60

Phe Asp Val Arg Ile Lys Ile Phe Met Leu Val Thr Ala Val Val Leu
65 70 75 80

Leu Cys Cys Ser Gly Val Ala Thr Ala Ala Pro Lys Thr Tyr Cys Glu
85 90 95

Glu Leu Lys Gly Thr Asp Thr Gly Gln Ala Cys Gln Ile Gln Met Ser
100 105 110

Asp Pro Ala Tyr Asn Ile Asn Ile Ser Leu Pro Ser Tyr Tyr Pro Asp
115 120 125

Gln Lys Ser Leu Glu Asn Tyr Ile Ala Gln Thr Arg Asp Lys Phe Leu
130 135 140

Ser Ala Ala Thr Ser Ser Thr Pro Arg Glu Ala Pro Tyr Glu Leu Asn
145 150 155 160

Ile Thr Ser Ala Thr Tyr Gln Ser Ala Ile Pro Pro Arg Gly Thr Gln
165 170 175

Ala Val Val Leu Xaa Val Tyr His Asn Ala Gly Gly Thr His Pro Thr
180 185 190

Thr Thr Tyr Lys Ala Phe Asp Trp Asp Gln Ala Tyr Arg Lys Pro Ile
195 200 205

Thr Tyr Asp Thr Leu Trp Gln Ala Asp Thr Asp Pro Leu Pro Val Val
210 215 220

Phe Pro Ile Val Ala Arg
225 230

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe

Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser

Gly Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly

40 45
Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val

50 55 60

Val Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val

Ile Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met IL

Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Trp

Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr tin Ala Glu

120 125

Gly Pro Pro Ala
130

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:



10 15 
Arg Arg Leu Ser Asn Pre 
30 

Pro Ala Thr Ala Ser Ale 
45 

Trp Arg Gly Pro Ala Thi 
60 

: K±a uiy Met Ala Arq Val Ara Atvt t™ y aa 

65 ™ 7 C 

Gly Pro Phe Asp Asn Arg Gl> 
90 95 



Pro 


Leu Arg 


Ser 
5 


Pro 


Ser 


Met 


Ser 


Gin 


Arg Asn 


Pro 


Val 


He 


Arg 


Arg 




20 










25 


Arg 


Lys Tyr 


Arg 


Ser 


Met 


Pro 


Ser 




35 








40 




Ala 


Arg Val 


Arg 


Arg Arg Ala 


He 


50 








55 






Ala 


Gly Met 


Ala 


Arg 


Val 


Arg 


Arg 








70 








Gin 


Ser Thr 


Xaa 


He 


Arg 


Xaa 


Xaa 






85 










Glu 


Arg Lys 














100 













(2) INFORMATION FOR SEQ ID NO: 68:






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Met Thr Asp Asp Ile Leu Leu Ile Asp Thr Asp Glu Arg Val Arg Thr
1 5 10 15

Leu Thr Leu Asn Arg Pro Gln Ser Arg Asn Ala Leu Ser Ala Ala Leu
20 25 30

Arg Asp Arg Phe Phe Ala Xaa Leu Xaa Asp Ala Glu Xaa Asp Asp Asp
35 40 45

Ile Asp Val Val Ile Leu Thr Gly Ala Asp Pro Val Phe Cys Ala Gly
50 55 60

Leu Asp Leu Lys Val Ala Gly Arg Ala Asp Arg Ala Ala Gly His Leu
65 70 75 80

Thr Ala Val Gly Gly His Asp Gln Ala Gly Asp Arg Arg Asp Gln Arg
85 90 95

Arg Arg Gly His Arg Arg Ala Arg Thr Gly Ala Val Leu Arg His Pro
100 105 110

Asp Arg Leu Arg Ala Arg Pro Leu Arg Arg His Pro Arg Pro Gly Gly
115 120 125

Ala Ala Ala His Leu Gly Thr Gln Cys Val Leu Ala Ala Lys Gly Arg
130 135 140

His Arg Xaa Gly Pro Val Asp Glu Pro Asp Arg Arg Leu Pro Val Arg
145 150 155 160

Asp Arg Arg

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 344 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:



Met 


Lys 


Phe 


Val 


Asn 


His 


He 


Glu 


Pro 


Val 


1 








5 










10 


Gly 


Ala 


Val 


Ala 


Glu 


val 


Tyr 


Ala 


Glu Ala 








20 










25 




Leu 


Pro 


Glu 


Pro 


Leu 


Ala 


Met 


Leu 


Ser 


Pro 


Ala 




35 










40 






Gly 
50 


Trp 


Ala 


Thr 


Leu 


Arg 
5^ 


Glu 


Thr 


Leu 


Arg 


Gly 


Arg 


Lys 


Glu 


Ala 


Val 


Ala 


Ala 


Ala 


65 










70 








Cys 


Pro 


Trp 


Cys 


val 


Asp 


Ala 


His 


Thr 


Thr 


Gin 








85 










90 


Thr 


Asp 


Thr 
100 


Ala 


Ala 


-Ala 


He 


Leu 
105 


Ala 



15 

Arg Arg Glu Phe Gly Arc 
30 

Asp Glu Gly Leu Leu Tin 
45 

Leu Val Gly Gin Val Pre 
60 

Val Ala Ala Ser Leu Arc 
75 so' 
Met Leu Tyr Ala Ala Gl} 
95 

Gly Thr Ala Pro Ala Ale 
110 
Gly Asp Pro Asn Ala Pro Tyr Val Ala Trp Ala Ala Gly Thr Gly Thr

120 12S 
Pro Ala Gly Pro Pro Ala Pro Phe Gly Pro Asp Val Ala Ala Glu Tyr 

Leu Gly Thr Ala Val Gin Phe His Phe lie Ala Arg Leu Val Leu Val 

Leu Leu Asp Glu Thr Phe Leu Pro Gly Gly Pro Arg Ala Gin Gin Leu 

170 

Met Arg Arg Ala Gly Gly Leu Val Phe Ala Arg Lys Val Arg Al ? a Glu 

185 190 
His Arg Pro Gly Arg Ser Thr Arg Arg Leu Glu Pro Arg Thr Leu Pro 

200 205 
Asp Asp Leu Ala Trp Ala Thr Pro Ser Glu Pro He Ala Thr Ala Phe 

Ala Ala Leu Ser His His Leu Asp Thr Ala Pro Hil Leu Pro Pro Pro 

Thr Arg Gin Val Val Arg Arg Val Val Gly Ser Trp His Gly Glu Pro 

245 250 
Met Pro Met Ser Ser Arg Trp Thr ^ n Glu H±s Thr ^ ^ ^ ^ 

Ala Asp Leu His Ala Pro Thr Arg Leu Ala Leu Leu Thr lly Leu Ala 
Pro His Gin Val Thr Asp Asp £!p val Ala Ala Ala ^g Ser Leu Leu 
Asp Thr Asp Ala Ala Leu lit Gly Ala Leu Ala J£ Ala Ala Phe Thr 

Ala Ala Arg Arg lie gJJ Thr Trp He Gly 21 Ala Ala Glu Gly G^n 

,r i „ 325 3 30 335 

Val Ser Arg Gin Asn Pro Thr Gly 

340 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

ASP Asp Pro Asp Met Pro Gly Thr Val Ala Lys Ala Val Ala Asp Ala 
5 10 15 

Leu Gly Arg Gly n e Ala Pro Val Glu Asp lie Gin Asp Cys Val Glu 

Ala Arg Leu Gly Glu Ala Gly Leu Asp Asp Val Ala Arg Val Tyr He 

He Tyr Arg Gin Arg Arg Ala Glu Leu Arg Thr Ala Lys Ala Leu Leu 
50 5S 



Gly Val Arg Asp Glu Leu Lys Leu Ser Leu Ala Ha Val Thr Val Leu 

70 75 
Arg Glu Arg Tyr Leu Leu His Asp Glu Gin Gly Arg Pro Ala Glu Ser 

85 90 95 

Thr Gly Glu Leu Me, Asp Arg Ser Ala Arg Cys Val Ala Ala Ala Glu 
100 105 110 
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Asp Gin Tyr Glu Pro Gly Ser Ser ^ ^ Trp ^ ^ ^ ^ ^ 



120 



Thr Leu Leu Arg Asn Leu Glu ^ Leu Pro Ser £ Thr ^ ^ 



135 



Asn Ser Gly Thr Asp Leu Gly Leu Leu ^ Qly ^ ^ ^ ^ 

He Olu Asp Ser Leu Gin Ser Xle Phe Ma Thr Leu Gly G ln Ma £ 
Clu Leu Gin Arg Ma Gly Qly Qly ^ J£ ^ ^ ^ ^ 175 ^ 

Ar, Pro Ala Gly Asp Arg Val Ala Ser Thr Gly Gly Thr £ Ser Qly 
Pro val ser Phe Leu Arg Leu T^r Asp Ser Ala Ala oJ y Val Val Ser 
Me, Gly Gly Arg Arg Arg Gly Ala Cys Met Ala III Leu Asp Val Ser 
His Pro Asp Xle Cys Asp P he Val Thr Ala Lys Ala Glu Ser Pro III 
Clu Leu Pro His Phe Asn Leu Ser Val Gly Val Thr Asp Ala Phe Leu 
Arg Ala Val Glu Arg Asn Gly Leu h" Arg Leu Val Asn III Arg Thr 
Gly Ly. Xle Val Ala Arg Met £ Ala Ala Glu Leu Phi Asp Ala Ue 

Cys Lys Ala Ala His Ala Gly Gly Asp Pro Gly Leu Val Phe Leu Asp 

315 

Thr lie Asn Arg Ala Asn Pro Val Pro Gly Arg Gly Arg lie Glu Ma 
Thr Asn Pro Cys Gly Glu Val Pro Leu III Pro Tyr Glu Ser Cys Asn 
Leu Gly ser Xle Asn Leu Ala Arg Met Leu Ala Asp Gly Arg Val Asp 
Trp Asp Arg Leu Glu Glu Val III Gly Val Ala Val Arg Phe Leu Asp 
Asp Val lie Asp Val Ser Arg Tyr Pro Phe Pro III Leu Gly Glu Ala 
Ala Arg Ala Thr Arg Lys lie Gly Leu Gly III Met Gly Leu Ala III 
Leu Leu Ala Ala Leu Gly Xle Pro Tyr Asp Ser Glu Glu Ala III Arg 
Leu Ala Thr Arg Leu Met Arg Arg III Gin Gin Ala Ma Z T hr Ala 
Ser Arg Arg Leu Ala Glu Glu Arg Gly Ala Phe Pro Ma Phe Thr Asp 
Ser Arg Phe Ma Arg Ser HI Pro ^ ^ ^ £ Gla ^ ^ ^ 

Val Ala Pro Thr Gly 475 480 

485 

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 267 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Gly Val lie Val Leu Asp Leu Glu Pro Arg Gly Pro Leu Pro Thr Glu 

5 10 15 

He Tyr Trp Arg Arg Arg Gly Leu Ala Leu Gly He Ala Val Val Val 

^ ^ ^ ^ 30 

val Gly lie Ala Val Ala He Val lie Ala Phe Val Asp Ser Ser Ala 

Gly Ala Lys Pro Val Ser Ala Asp Lys Pro Ala Ser Ma Gin Ser His 

S ° 55 60 

Pro Gly Ser Pro Ala Pro Gin Ala Pro Gin Pro Ala Gly Gin Thr Glu 

Gly Asn Ala Ala Ala Ala Pro Pro Gin Gly gL Asn Pro Glu Thr Pro 

Thr Pro Thr Ala Ala Val Gin Pro Pro Pro Val Leu Lys Glu Glv Aso 

105 110 
Asp Cys Pro Asp Ser Thr Leu Ala Val Lys Gly Leu Thr Asn Ala Pro 

Gin Tyr Tyr Val Gly Asp Gin Pro Lys Phe Thr Met III Val Thr Asn 

"* 14 0 

lie Gly Leu Val Ser Cys Lys Arg Asp Val Gly £1 Ala Val Leu Ala 

Ala Tyr Val Tyr Ser Leu Asp Asn Lys Arg £eu Trp Ser Asn Leu ip 



155 170 175 



cys Ala Pro Ser Asn Glu Thr Leu Val Lys Thr Phe Ser Pro Gly Glu 

180 190 

Gin Val Thr Thr Ala Val Thr Trp Thr Gly Met Gly Ser Ala Pro Arg 

195 2 °0 205 

Cys Pro Leu Pro Arg Pro Ala lie Gly Pro Gly Thr Tyr Asn Leu Val 

■J 1U ~t t i- 



-15 220 
Val Gin Leu Gly Asn Leu Arg Ser Leu Pro Val Pro Phe lie Leu Asn 

Gin Pro Pro Pro Pro Pro Gly Pro Val Pro Ala Pro Gly Pro Ala 111 



245 250 25 = 



Ala Pro Pro Pro Glu Ser Pro Ala Gin Gly Gly 
260 265 

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 97 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:

Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly Val Gln Val
1 5 10 15

Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu Val Val Ala
20 25 30

Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val Val Val Thr
35 40 45

Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu Val Ala Ala
50 55 60

Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr Phe Gln Asp
65 70 75 80

Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly Lys Ala Glu
85 90 95

Gln

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 364 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Gly Ala Ala Val Ser Leu Leu Ala Ala Gly Thr Leu Val Leu Thr Ala 
Cys Gly Gly Gly Thr Asn Ser Ser Ser Ser Gly Ala Gly Gly Thr Ser 
Gly Ser Val His Cys Gly Gly Lys Lys Glu Leu His Ser Ser Gly Ser 
Thr Ala Gin Glu Asn Ala Met Glu Gin Phe Val Tyr Ma Tyr Val Arg 
Ser Cys Pro Gly Tyr Thr Leu Asp Tyr Asn Ala Asn Gly Ser Gly Ala 

Gly Val Thr Gin Phe Leu Asn Asn Glu Thr Asp Phe Ala Gly Ser Asp 

85 90 95 

Val Pro Leu Asn Pro Ser Thr Gly Gin Pro Asd Arg Ser Ala Glu Arg 

100 105 
v-ys Gly ser Pro Ala Trp Asp Leu Pro Thr Val Phe Gly Pro lie Ala 

He Thr Tyr Asn He Lys Gly VaS Ser Thr Leu Asn tit Asp Gly Pro 

13 5 140 
Thr Thr Ala Lys lie ?he Gly Thr He Thr Val Trp Asn Asp Pro 

160 

Gin He Gin Ala Leu Asn Ser Gly Thr Asp Leu Pro Pro Thr Pro lie 

ser Val lie Phe Arg Ser Asp Lys Ser Gly Thr Ser Asp Asn Phe Gin 
180 185 190 



Lys Tyr Leu Asp Gly Val Ser Asn Gly Ala Trp Gly Lys Gly Ala Ser 

195 200 205 

Glu Thr Phe Ser Gly Gly Val Gly Val Gly Ala Ser Gly Asn Asn Gly 

Thr Ser Ala Leu Leu Gin Thr Thr Asp Gly Ser He Thr Tyr Asn Glu 

230 

Trp Ser Phe Ala Val Gly Lys. Gin Leu Asn Met Ala Gin lie He Tnr 

245 250 ,55 

Ser Ala Gly Pro Asp Pro Val Ala He Thr Thr Glu Ser Val Gly Lys 

2 65 270 
Thr lie Ala Gly Ala Lys He Met Gly Gin Gly Asn Asp Leu Val Leu 

28 0 

Asp Thr Ser Ser Phe Tyr Arg Pro Thr Gin Pro Gly Ser Tyr Pro He 
290 295 300

Val Leu Ala Thr Tyr Glu He Val Cys Ser Lys Tyr Pro Asp Ala Thr 
5 310 315 320 

Thr Gly Thr Ala Val Arg Ala Phe Met Gin Ala Ala lie Gly Pro Gly 

325 330 335 

Gin Glu Gly Leu Asp Gin Tyr Gly Ser lie Pro Leu Pro Lys Ser Phe 

340 345 350 

Gin Ala Lys Leu Ala Ala Ala Val Asn Ala lie Ser 
355 360 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 309 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Gin Ala Ala Ala Gly Arg Ala Val Arg Arg Thr Gly His Ala Glu Asp 

10 15 
Gin Thr His Gin Asp Arg Leu His His Gly Cys Arg Arg Ala Ala Val 

20 25 30 

val val Arg Gin Asp Arg Ala Ser Val Ser Ala Thr Ser Ala Arg Pro 

35 40 45 

Pro Arg Arg His Pro Ala Gin Gly His Arg Arg Arg Val Ala Pro Ser 

50 55 go 

Gly Gly Arg Arg Arg Pro His Pro His His Val Gin Pro Asp Asp Arg 

! 70 75 80 

Arg Asp Arg Pro Ala Leu Leu Asp Arg Thr Gin Pro Ala Glu His Pro 

85 90 95 

Asp Pro His Arg Arg Gly Pro Ala Asp Pro Glv Arg Val Arg Gly *rg 

100 105 ■'■'0 

Gly Arg Leu Arg Arg Vai Asp Asp Gly Arg Leu Gin Pro Asp Arg Aso 

115 120 12 s 

Ala Asp His Gly Ala Pro Val Arg Gly Arg Gly Pro His Arg Gly Val 

130 135 140 

Gin His Arg Gly Gly Pro Val Phe Val Arg Arg Val Pro Gly Val Arg 

150 155 16u 

Cys Ala His Arg Arg Gly His Arg Arg Val Ala Ala Pro Gly Gin Gly 

165 170 175 

Asp Val Leu Arg Ala Gly Leu Arg Val Glu Arg Leu Arg Pro Val Ala 

180 185 190 

Ala Val Glu Asn Leu His Arg Gly Ser Gin Arg Ala Asp Gly Arg Val 

r>u » 195 200 205 

Phe Arg Pro lie Arg Arg Gly Ala Arg Leu Pro Ala Arg Arg Ser Arg 

210 215 220 

Ala Gly Pro Gin Gly Arg Leu His Leu Asp Gly Ala Gly Pro Ser Pro 
225 230 235 240 



Leu Pro Ala Arg Ala Gly Gin Gin Gin Pro Ser Ser Ala Gly Gly Arg 

245 250 255 

Arg Ala Gly Gly Ala Glu Arg Ala Asp Pro Gly Gin Arg Gly Arg His 

260 265 270 

His G.n Gly Gly His Asp Pro Gly Arg Gin Gly Ala Gin Arg Glv Thr 
275 280 285

Ala Gly Val Ala His Ala Ala Ala Gly Pro Arg Arg Ala Ala Val Arg 

295 300 

Asn Arg Pro Arg Arg 
305 

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 580 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

Ser Ala Val Trp Cys Leu Asn Gly Phe Thr Gly Arg His Arg His Gly 

Arg Cys Arg Val Irg Ala Ser Gly Trp ^g Ser Ser Asn Arg tL Cvs 

25 3 0 ~ " 

Ser Thr Thr Ala Asp Cys Cys Ala Ser Lys Thr Pro Thr Gin Ala Ala 

Ser Pro Leu Glu Arg Arg Phe Thr Cys Cys Ser Pro Ala Val Gly Cys 

Arg Phe Arg Ser Phe Pro Val Arg Arg Leu Ala Leu Gly Ala Arg Thr 



Ser Arg Thr Leu Gly Val Arg Arg Thr Leu Ser Gin Trp Asn Leu Ser 

85 90 95 

Pro Arg Ala Gin Pro Ser Cys Ala Val Thr Val Glu Ser His Thr His 

100 105 110 

Ala Ser Pro Arg Met Ala Lys Leu Ala Arg Val Val Gly Leu Val Gin 

Glu Glu Gin ?ro Ser Asp Met Thr Asn His Pro Arg t" Ser Pro Pro 

13 5 

Pro Gin Gin Pro Gly Thr Pro Gly Tyr Ala Gin Gly Gin Gin Gin Thr 

Tyr Ser Gin Gin Phe As"p Trp Arg Tyr Pro III Ser Pro Pro Pro Gin 

165 170 175 

Pro Thr Gin Tyr Arg Gin Pro Tyr Glu Ala Leu Gly Gly Thr Arg =ro 

180 185 190 

Gly Leu lie Pro Gly Val prQ Thr ^ ^ ^ ^ ^ 

200 205 
Val Arg Gin Arg Pro Arg Ala Gly Met Leu Ala lie Gly Ala Val Thr 

lie Ala Val Val Ser Ua Gly Ile ffly Qly ^ ^° ^ ^ ^ 

230 235 
Gly Phe Asn Arg Ala Pro Ala Gly Pro Ser Gly Gly Pro Val Ala A^a 

245 250 255 

Ser Ala Ala Pro Ser He Pro Ala Ala Asn Met Pro Pro Gly Ser Val 

Glu Gin Val Ala Ala Lys Val Val III Ser Val Val Met L^u Glu Thr 



275 280 285 



Asp Leu Gly Arg Gin Ser Glu Glu Gly Ser Gly lie III Leu Ser Ala 

2 95 

Glu Gly Leu Zle Leu Thr Asn Asn His Val He Ala Ala Ala Ala Lys 



305 310 315 320 

Pro Pro Leu Gly S er Pro Pro Pro Lys Thr Thr Val Thr Phe Ser Asp 

325 330 335 

Gly Arg Thr Ala Pro Phe Thr Val Val Gly Ala Asp Pro Thr Ser Asp 
340 345 350 P 

He Ala val Val Arg Val Gin Gly Val Ser Gly Leu Thr Pro lie Ser 

360 365 
Leu Gly Ser Ser Ser Asp Leu Arg Val Gly Gin Pro Val Leu Ala lie 

375 380 
Gly Ser Pro Leu Gly Leu Glu Gly Thr Val Thr Thr Gly lie Val Ser 

05 390 ;gc 

Ala Leu Asn Arg Pro Val Ser Thr Thr Gly Glu Ala Gly Asn Gin Asn 

405 410 415 

Thr Val Leu Asp Ala lie Gin Thr Asp Ala Ala lie Asn Pro Gly Asn 

420 42 5 430 

Ser Gly Gly Ala Leu Val Asn Met Asn Ala Gin Leu Val Gly Val Asn 

440 44S 

Ser Ala He Ala Thr Leu Gly Ala Asp Ser Ala Asp Ala Gin Ser Gly 

Ser lie Gly Leu Gly Phe Ala He Pro Val Asp Gin Ala Lys Arg lie 

Ala Asp Glu Leu lie sir Thr Gly Lys Ala Ser His Ala Ser Leu Ty 

_ _ 485 490 495 

Val Gin Val Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys lie Val Glu 

505 510 
Val val Ala Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val 

520 525 
Val Val Thr Lys Val Asp Asp Arg Pro lie Asn Ser Ala Asp Ala Leu 

535 540 
val Ala Ala Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr 

Phe Gin Asp Pro Ser Gly Gly Ser Arg Thr vll Gin Val Thr Leu Gl'y 

50Z> 57 <> 575 

Lys Ala Glu Gin * b 

580 

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 233 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

Met Asn Asp Gly Lys Arg Ala Val Thr Ser Ala Val Leu Val Val Leu

1 5 10 15

Gly Ala Cys Leu Ala Leu Trp Leu Ser Gly Cys Ser Ser Pro Lys Pro

20 25 30

Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr Ala Ser Asp Pro

35 40 45

Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala Thr Lys Gly Leu

50 55 60

Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys Val Asp Ser Leu

65 70 75 80

Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala Asn Pro Leu Ala

85 90 95

Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly Val Pro Phe Arg

100 105 110

Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp Asp Trp Ser Asn

115 120 125

Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val Leu Asp Pro Ala

130 135 140

Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn Leu Gln Ala Gln

145 150 155 160

Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys Ile Thr Gly Thr

165 170 175

Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly Ala Lys Ser Ala

180 185 190

Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser His His Leu Val

195 200 205

Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln Leu Thr Gln Ser

210 215 220

Lys Trp Asn Glu Pro Val Asn Val Asp

225 230



(2) INFORMATION FOR SEQ ID NO: 77: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Val Ile Asp Ile Ile Gly Thr Ser Pro Thr Ser Trp Glu Gln Ala Ala

1 5 10 15

Ala Glu Ala Val Gln Arg Ala Arg Asp Ser Val Asp Asp Ile Arg Val

20 25 30

Ala Arg Val Ile Glu Gln Asp Met Ala Val Asp Ser Ala Gly Lys Ile

35 40 45

Thr Tyr Arg Ile Lys Leu Glu Val Ser Phe Lys Met Arg Pro Ala Gln

50 55 60 

Pro Arg 
65 



(2) INFORMATION FOR SEQ ID NO: 78: 



(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 69 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 



Val Pro Pro Ala Pro Pro Leu Pro Pro Leu Pro Pro Ser Pro Ile Ser
1 5 10 15

Cys Ala Ser Pro Pro Ser Pro Pro Leu Pro Pro Ala Pro Pro Val Ala

20 25 30

Pro Gly Pro Pro Met Pro Pro Leu Asp Pro Trp Pro Pro Ala Pro Pro

Leu Pro Tyr Ser Thr Pro Pro Gly Ala Pro Leu Pro Pro Ser Pro Pro

50 55 60

Ser Pro Pro Leu Pro 

65 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Met Ser Asn Ser Arg Arg Arg Ser Leu Arg Trp Ser Trp Leu Leu Ser

Val Leu Ala Ala Val Gly Leu Gly Leu Ala Thr Ala Pro Ala Gln Ala

20 25 30

Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu
35 40 45

Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Ala Pro Gln Val Val

50 55 60

Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly Thr

70 75
Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val
85 90 95

Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val Gly Ser Gly Gln

100 105 110

Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp Val Ala

120 125
Val Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile Gly

Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser Gly
145 150 155 160

Gly Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala III

Gly Gln Thr Val Gln Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr

180 185 190

Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln Pro Gly Asp Ser

195 200 205

Gly Gly Pro Val Val Asn Gly Leu Gly Gln Val Val Gly Met Asn Thr

210 215 220

Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe Ala
230 235 240

Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser Gly

245 250 255

Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly Leu

260 265 270

Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val Val
275 280 285

Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val Ile

290 295 300

Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala Asp

305 310 315 320

Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp Gln

325 330 335 

Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu Gly 
340 345 

Pro Pro Ala 
355 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 205 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Ser Pro Lys Pro Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr

5 10 15

Ala Ser Asp Pro Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala

20 25 30

Thr Lys Gly Leu Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys

Val Asp Ser Leu Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala

50 55 60

Asn Pro Leu Ala Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly

Val Pro Phe Arg Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp

85 90 95

Asp Trp Ser Asn Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val

100 105 110

Leu Asp Pro Ala Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn

Leu Gln Ala Gln Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys
130 135 140

Ile Thr Gly Thr Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly
145 150 155 160

Ala Lys Ser Ala Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser

165 170 175

His His Leu Val Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln

180 185 190

Leu Thr Gln Ser Lys Trp Asn Glu Pro Val Asn Val Asp
195 200 205 

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 296 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

Gly Asp Ser Phe Trp Ala Ala Ala Asp Gln Met Ala Arg Gly Phe Val
5 10

Leu Gly Ala Thr Ala Gly Arg Thr Thr Leu Thr Gly Glu Gly Leu Gln

20 25 30

His Ala Asp Gly His Ser Leu Leu Leu Asp Ala Thr Asn Pro Ala Val

35 40 45

Val Ala Tyr Asp Pro Ala Phe Ala Tyr Glu Ile Gly Tyr Ile Xaa Glu

Ser Gly Leu Ala Arg Met Cys Gly Glu Asn Pro Glu Asn Ile Phe Phe

70 75
Tyr Ile Thr Val Tyr Asn Glu Pro Tyr Val Gln Pro Pro Glu Pro Glu

85 90 95

Asn Phe Asp Pro Glu Gly Val Leu Gly Gly Ile Tyr Arg Tyr His Ala

100 105 110

Ala Thr Glu Gln Arg Thr Asn Lys Xaa Gln Ile Leu Ala Ser Gly Val

115 120 125

Ala Met Pro Ala Ala Leu Arg Ala Ala Gln Met Leu Ala Ala Glu Trp

Asp Val Ala Ala Asp Val Trp Ser Val Thr Ser Trp Gly Glu Leu Asn

Arg Asp Gly Val Val Ile Glu Thr Glu Lys lit Arg His Pro Asp Arg

165 170 175

Pro Ala Gly Val Pro Tyr Val Thr Arg Ala Leu Glu Asn Ala Arg Gly

180 190

Pro Val Ile Ala Val Ser Asp Trp Met Arg Ala Val Pro Glu Gln Ile

195 200 205

Arg Pro Trp Val Pro Gly Thr Tyr Leu Thr Leu Gly Thr Ast, Gly Phe

210 215 220

Gly Phe Ser Asp Thr Arg Pro Ala Gly Arg Arg Tyr Phe Asn Thr Asp
225 230 235 240

Ala Glu Ser Gln Val Gly Arg Gly Phe Gly Arg Gly Trp Pro Gly Arg

Arg Val Asn Ile Asp Pro Phe Gly Ala Gly Arg Gly Pro Pro Ala Gln

260 265 270



Leu Pro Gly Phe Asp Glu Gly Gly Gly Leu Arg Pro Xaa Lys 
275 280 285 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Thr Lys Phe His Ala Leu Met Gln Glu Gln Ile His Asn Glu Phe Thr
Ala Ala Gln Gln Tyr Val Ala Ile Ala III Tyr Phe Asp Ser Glu Asp
Leu Pro Gln Leu Ala Lys His Phe Tyr Ser Gln Ala Val Glu Glu Arg
35 40

Asn His Ala Met Met Leu Val Gln His Leu Leu Asp Arg Asp Leu Arg
Val Glu Ile Pro Gly Val Asp Thr Val Arg Asn Gln Phe Asp Arg Pro
Arg Glu Ala Leu Ala Leu Ala Leu Asp Gln Glu Arg Thr Val Thr Asp
Gln Val Gly Arg Leu Thr Ala Val Ala Arg Asp Glu Gly Asp Phe Leu
Gly Glu Gln Phe Met Gln Trp Phe Leu Gln Glu Gln Ile Gl" Glu Val

Ala Leu Met Ala Thr Leu Val Arg Val Ala Asp Arg Ala Gly Ala Asn

140

Leu Phe Glu Leu Glu Asn Phe Val Ala Arg Glu Val Asp Val Ala Pro

155

Ala Ala Ser Gly Ala Pro His Ala Ala Gly Gly Arg Leu

165 170 
(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 107 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

Arg Ala Asp Glu Arg Lys Asn Thr Thr Met Lys Met Val Lys Ser Ile

10

Ala Ala Gly Leu Thr Ala Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly
Val Thr Ser Ile Met Ala Gly Gly Pro Val Val Tyr Gln Met Gln Pro
Val Val Phe Gly Ala Pro Leu Pro Leu Asp Pro Xaa Ser Ala Pro Xaa
Val Pro Thr Ala Ala Gln Trp Thr ^ Leu ^ £ ^ ^ ^ ^

0 3 7 q s\aa Map

Pro Asn Val Ser Phe Xaa Asn Lys Gly Ser Leu Val Glu Gly Gly Ile

90

Gly Gly Xaa Glu Gly Xaa Xaa Arg Arg Xaa Gln



100 105 



(2) INFORMATION FOR SEQ ID NO: 84: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

Val Leu Ser Val Pro Val Gly ^ Gly ^ Trp ^ ^ ^ ^ ^
Pro Leu Gly Gln Pro Ile Asp Gly Arg Gly ^ Val ^ Ser ^ ^
25 30

Arg Arg Ala Leu Glu Leu Gln Ala Pro Ser Val Val Xaa Arg Gln Gly
Val Lys Glu Pro Leu Xaa Thr Gly Ile Lys Ala Ile ^ p ^ ^ Thr
Pro Ile Gly Arg Gly Gln Arg Gln Leu Ile Ile Gly Asp Arg Lys Thr
Gly Lys Asn Arg Arg Leu Cys Arg Thr Pro Ser Ser Asn Gln Arg 11
Glu Leu Gly Val Arg Trp Ile Pro Arg Ser Arg Cys Ala Cys Val Tyr

105

Val Gly His Arg Ala Arg Arg Gly ^ Tyr His Arg Arg ^
115 120 125

(2) INFORMATION FOR SEQ ID NO: 85: 



(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 117 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Cys Asp Ala Val Met Gly Phe Leu Gly Gly ^ ^ ^ ^ ^ ^
Val Asp Gln Gln Leu Val Thr Arg Val Pro Gln Gly Trp Ser Phe Ala
Gln Ala Ala Ala Val Pro Val Val Phe Leu Thr Ala Trp £ Gly Leu
Ala Asp Leu Ala Glu Ile Lys tL Gly Glu Ser Val Leu Ile His Ala
Gly Thr Gly Gly Val Gly £ t Ala Ala Val Gln IL Ala Arg Gln Trp
Gly Val Glu Val Phe Val Thr Ala Ser ^ £ y Lys ^ ^ ^ ^
Arg Ala Xaa Xaa Phe Asp Asp Xaa Pro Tyr Arg Xaa Phe Pro His Xaa

100 105 110

Arg Ser Ser Xaa Gly 
115 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 86: 

Met Tyr Arg Phe Ala Cys Arg Thr Leu Met Leu Ala Ala Cys Ile Leu

Ala Thr Gly Val Ala Gly Leu Gly Val £ y ^ ^ ^ ^ £ ^
Thr Ala Pro Val Pro Asp Tyr ^ Trp ^ Pro ^ ^ ^ ^ ^
Pro Ala Trp Gly Pro ^ ^ £ Pro ^ ^ ^ « ^ ^ ^
His Arg Asp Ser Asp Gly Pro ^ His Ser ^ £ ^ ^ ^ ^
Ile Leu Glu Gly Pro Val Leu ^ ^ Pro £ y wa 30

90 

Pro Ala Ala Gly Gly Gly Ala 95 
100 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 88 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Val Gla Cys Arg Val Trp Leu Glu Gln Trp ^ ^ ^ ^ ^

-a Asp G m ^ ^ ^ Gly Gly £ ^ ^ ^ ^ „ ^

-« Met Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro £ Glu ^
Thr Lys Glu Gly Arg Gly Ile £ Met Arg Val Pro Leu Glu Gly Gly
Gly Arg Leu Val Val Glu Leu Thr Pro Asp Glu £. Ala Ala Leu Gly 
Asp Glu Leu Lys Gly Val Thr Ser 75 



80 

■ -<»j. ulc ser 

85 

(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 95 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile
Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly
Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala
Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu
Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg
80
Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
85 90 95



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn

Arg Ala Asn Glu Val Glu Ala Pro Met Asp Pro Pro Thr Jfp Val

Pro Ile Thr Pro Cys Glu Leu Thr xL Xaa Lys Asn Ala ^a Gln Gln

Xaa Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu IL Ala Gly Ala

Lys Glu Arg Gln Arg Leu Sa Thr Ser Leu Arg " a Ala Ala Lys Xaa

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp S y

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser

100 105 110

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro

120 125
Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp

Gln Gly Ala Ser Leu Ala His Xaa Gly Asp Gly ^ Asn Thr Xaa Thr

150 155 160

Leu Thr Leu Gln Gly Asp

165 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Arg Ala Glu Arg Met 

1 5 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 263 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Val Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr Ala Ala

1 5 10 15

Gln Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly Leu Thr

20 25 30

Val Pro Pro Pro Val Ile Ala Glu Asn Arg Ala Glu Leu Met Ile Leu

35 40 45

Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr Pro Ala Ile Ala Val Asn

50 55 60

Glu Ala Glu Tyr Gly Glu Met Trp Ala Gln Asp Ala Ala Ala Met Phe

70 75 80

Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro Phe

85 90 95

Glu Glu Ala Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln Ala

100 105 110

Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gln Leu Met

115 120 125

Asn Asn Val Pro Gln Ala Leu Lys Gln Leu Ala Gln Pro Thr Gln Gly

130 135 140

Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys Thr Val Ser Pro

145 150 155 160

His Arg Ser Pro Ile Ser Asn Met Val Ser Met Ala Asn Asn His Met

165 170 175

Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met

180 185 190

Leu Lys Gly Phe Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr Ala

195 200 205

Ala Gln Asn Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu Gly

210 215 220

Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala Ala

225 230 235 240

Ser Val Arg Tyr Gly His Arg Asp Gly Gly Lys Tyr Ala Xaa Ser Gly

245 250 255

Arg Arg Asn Gly Gly Pro Ala

260 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Met Thr Tyr Ser Pro Gly Asn Pro Gly Tyr Pro Gln Ala Gln Pro Ala
5 10 15

Gly Ser Tyr Gly Gly Val Thr Pro Ser Phe Ala His Ala Asp Glu Gly

20 25 30

Ala Ser Lys Leu Pro Met Tyr Leu Asn Ile Ala Val Ala Val Leu Gly

35 40 45

Leu Ala Ala Tyr Phe Ala Ser Phe Gly Pro Met Phe Thr Leu Ser Thr
50 55 60

Glu Leu Gly Gly Gly Asp Gly Ala Val Ser Q1 ^

70 75 80

Val Gly Val Ala Leu Leu Ala Ala Leu Leu Ala Gly Val Val Leu Val

85 90 95

Pro Lys Ala Lys Ser His Val Thr Val Val Ala Val Leu Gly Val Leu

100 105 110

Gly Val Phe Leu Met Val Ser Ala Thr Phe Asn Lys Pro Ser Ala Tyr

115 120 125

Ser Thr Gly Trp Ala Leu Trp Val Val Leu Ala Phe Ile Val Phe Gln

130 135 140

Ala Val Ala Ala Val Leu Ala Leu Leu Val Glu Thr Gly Ala Ile Thr

150 155 160

Ala Pro Ala Pro Arg Pro Lys Phe Asp Pro Tyr Gly Gln Tyr Gly Arg

165 170 175

Tyr Gly Gln Tyr Gly Gln Tyr Gly Val Gln Pro Gly Gly Tyr Tyr Gly

180 185 190

Gln Gln Gly Ala Gln Gln Ala Ala Gly Leu Gln Ser Pro Gly Pro Gln

195 200 205

Gln Ser Pro Gln Pro Pro Gly Tyr Gly Ser Gln Tyr Gly Gly Tyr Ser

210 215 220

Ser Ser Pro Ser Gln Ser Gly Ser Gly Tyr Thr Ala Gln Pro Pro Ala

230 235
Gln Pro Pro Ala Gln Ser Gly Ser Gln Gln Ser His Gln Gly Pro Ser

245 250 255

Thr Pro Pro Thr Gly Phe Pro Ser Phe Ser Pro Pro Pro Pro Val Ser

265 270
Ala Gly Thr Gly Ser Gln Ala Gly Ser Ala Pro Val Asn Tyr Ser Asn

280 285

Pro Ser Gly Gly Glu Gln Ser Ser Ser Pro Gly Gly Ala Pro Val
290 295 300 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

Gly Cys Gly Glu Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn

5 10 15

Phe Glu Arg Ile Ser Gly Asp Leu Lys Thr Gln Ile
20 25 

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly



5 10 15 

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

Gly Cys Gly Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala
15

Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg
20 25

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Gly Cys Gly Gly Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln Glu

Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu
20 25 

(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Gly Cys Gly Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser Thr

1 5 10 15

Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Gly Cys Gly Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu

1 5 10 15

Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
20 25 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 507 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

ATGAAGATGG TGAAATCGAT CGCCGCAGGT CTGACCGCCG CGGCTGCAAT CGGCGCCGCT 60

GCGGCCGGTG TGACTTCGAT CATGGCTGGC GGCCCGGTCG TATACCAGAT GCAGCCGGTC 120

GTCTTCGGCG CGCCACTGCC GTTGGACCCG GCATCCGCCC CTGACGTCCC GACCGCCGCC 180

CAGTTGACCA GCCTGCTCAA CAGCCTCGCC GATCCCAACG TGTCGTTTGC GAACAAGGGC 240

AGTCTGGTCG AGGGCGGCAT CGGGGGCACC GAGGCGCGCA TCGCCGACCA CAAGCTGAAG 300

AAGGCCGCCG AGCACGGGGA TCTGCCGCTG TCGTTCAGCG TGACGAACAT CCAGCCGGCG 360

GCCGCCGGTT CGGCCACCGC CGACGTTTCC GTCTCGGGTC CGAAGCTCTC GTCGCCGGTC 420

ACGCAGAACG TCACGTTCGT GAATCAAGGC GGCTGGATGC TGTCACGCGC ATCGGCGATG 480

GAGTTGCTGC AGGCCGCAGG GAACTGA 507

(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 168 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala Ala Ala Ala

1 5 10 15

Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala Gly Gly Pro

20 25 30

Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro Leu Pro Leu

35 40 45

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser

50 55 60

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn Lys Gly

65 70 75 80

Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg Ile Ala Asp

85 90 95

His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro Leu Ser Phe

100 105 110

Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala Thr Ala Asp

115 120 125

Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr Gln Asn Val

130 135 140

Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala Ser Ala Met

145 150 155 160

Glu Leu Leu Gln Ala Ala Gly Asn

165

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

CGTGGCAATG TCGTTGACCG TCGGGGCCGG GGTCGCCTCC GCAGATCCCG TGGACGCGGT 60 

CATTAACACC ACCTGCAATT ACGGGCAGGT AGTAGCTGCG CTCAACGCGA CGGATCCGGG 120 

GGCTGCCGCA CAGTTCAACG CCTCACCGGT GGCGCAGTCC TATTTGCGCA ATTTCCTCGC 180 

CGCACCGCCA CCTCAGCGCG CTGCCATGGC CGCGCAATTG CAAGCTGTGC CGGGGGCGGC 240

ACAGTACATC GGCCTTGTCG AGTCGGTTGC CGGCTCCTGC AACAACTATT AAGCCCATGC 300 

GGGCCCCATC CCGCGACCCG GCATCGTCGC CGGGGCTAGG CCAGATTGCC CCGCTCCTCA 360 

ACGGGCCGCA TCCCGCGACC CGGCATCGTC GCCGGGGCTA GGCCAGATTG CCCCGCTCCT 420 

CAACGGGCCG CATCTCGTGC CGAATTCCTG CAGCCCGGGG GATCCACTAG TTCTAGAGCG 480 
GCCGCCACCG CGGTGGAGCT 500

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Val Ala Met Ser Leu Thr Val Gly Ala Gly Val Ala Ser Ala Asp Pro

1 5 10 15

Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val Val Ala

20 25 30

Ala Leu Asn Ala Thr Asp Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser

35 40 45

Pro Val Ala Gln Ser Tyr Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro

50 55 60

Gln Arg Ala Ala Met Ala Ala Gln Leu Gln Ala Val Pro Gly Ala Ala

65 70 75 80

Gln Tyr Ile Gly Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr

85 90 95



(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 154 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO:103: 

ATGACAGAGC AGCAGTGGAA TTTCGCGGGT ATCGAGGCCG CGGCAAGCGC AATCCAGGGA 60 
AATGTCACGT CCATTCATTC CCTCCTTGAC GAGGGGAAGC AGTCCCTGAC CAAGCTCGCA 120 
GCGGCCTGGG GCGGTAGCGG TTCGGAAGCG TACC 154 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO:104: 

Met Thr Glu Gln Gln Trp Asn Phe Ala Gly Ile Glu Ala Ala Ala Ser

Ala Ile Gln Gly Asn Val Thr Ser Ile His Ser Leu Leu Asp Glu Gly

20 25 30

Lys Gln Ser Leu Thr Lys Leu Ala Ala Ala Trp Gly Gly Ser Gly Ser

35 40 45 

Glu Ala Tyr 
50 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 282 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

CGGTCGCGCA CTTCCAGGTG ACTATGAAAG TCGGCTTCCG NCTGGAGGAT TCCTGAACCT 60

TCAAGCGCGG CCGATAACTG AGGTGCATCA TTAAGCGACT TTTCCAGAAC ATCCTGACGC 120

GCTCGAAACG CGGCACAGCC GACGGTGGCT CCGNCGAGGC GCTGNCTCCA AAATCCCTGA 180

GACAATTCGN CGGGGGCGCC TACAAGGAAG TCGGTGCTGA ATTCGNCGNG TATCTGGTCG 240

ACCTGTGTGG TCTGNAGCCG GACGAAGCGG TGCTCGACGT CG 282 

(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 3058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
GATCGTACCC GTGCGAGTGC TCGGGCCGTT TGAGGATGGA GTGCACGTGT CTTTCGTGAT 
GGCATACCCA GAGATGTTGG CGGCGGCGGC TGACACCCTG CAGAGCATCG GTGCTACCAC 
TGTGGCTAGC AATGCCGCTG CGGCGGCCCC GACGACTGGG GTGGTGCCCC CCGCTGCCGA 
TGAGGTGTCG GCGCTGACTG CGGCGCACTT CGCCGCACAT GCGGCGATGT ATCAGTCCGT 
GAGCGCTCGG GCTGCTGCGA TTCATGACCA GTTCGTGGCC ACCCTTGCCA GCAGCGCCAG 
CTCGTATGCG GCCACTGAAG TCGCCAATGC GGCGGCGGCC AGCTAAGCCA GGAACAGTCG 
GCACGAGAAA CCACGAGAAA TAGGGACACG TAATGGTGGA TTTCGGGGCG TTACCACCGG 
AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC CTCGCTGGTG GCCGCGGCTC
AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC GTCGGCGTTT CAGTCGGTGG
TCTGGGGTCT GACGGTGGGG TCGTGGATAG GTTCGTCGGC GGGTCTGATG GTGGCGGCGG 
CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA GGCCGAGCTG ACCGCCGCCC 
AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG GCTGACGGTG CCCCCGCCGG 
TGATCGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC GACCAACCTC TTGGGGCAAA 
ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGCGA GATGTGGGCC CAAGACGCCG 
CCGCGATGTT TGGCTACGCC GCGGCGACGG CGACGGCGAC GGCGACGTTG CTGCCGTTCG 
AGGAGGCGCC GGAGATGACC AGCGCGGGTG GGCTCCTCGA GCAGGCCGCC GCGGTCGAGG 
AGGCCTCCGA CACCGCCGCG GCGAACCAGT TGATGAACAA TGTGCCCCAG GCGCTGCAAC 
AGCTGGCCCA GCCCACGCAG GGCACCACGC CTTCTTCCAA GCTGGGTGGC CTGTGGAAGA 
CGGTCTCGCC GCATCGGTCG CCGATCAGCA ACATGGTGTC GATGGCCAAC AACCACATGT 
CGATGACCAA CTCGGGTGTG TCGATGACCA ACACCTTGAG CTCGATGTTG AAGGGCTTTG
CTCCGGCGGC GGCCGCCCAG GCCGTGCAAA CCGCGGCGCA AAACGGGGTC CGGGCGATGA 
GCTCGCTGGG CAGCTCGCTG GGTTCTTCGG GTCTGGGCGG TGGGGTGGCC GCCAACTTGG 
GTCGGGCGGC CTCGGTCGGT TCGTTGTCGG TGCCGCAGGC CTGGGCCGCG GCCAACCAGG 
CAGTCACCCC GGCGGCGCGG GCGCTGCCGC TGACCAGCCT GACCAGCGCC GCGGAAAGAG 
GGCCCGGGCA GATGCTGGGC GGGCTGCCGG TGGGGCAGAT GGGCGCCAGG GCCGGTGGTG 
GGCTCAGTGG TGTGCTGCGT GTTCCGCCGC GACCCTATGT GATGCCGCAT TCTCCGGCGG 
CCGGCTAGGA GAGGGGGCGC AGACTGTCGT TATTTGACCA GTGATCGGCG GTCTCGGTGT
TTCCGCGGCC GGCTATGACA ACAGTCAATG TGCATGACAA GTTACAGGTA TTAGGTCCAG
GTTCAACAAG GAGACAGGCA ACATGGCCTC ACGTTTTATG ACGGATCCGC ACGCGATGCG 
GGACATGGCG GGCCGTTTTG AGGTGCACGC CCAGACGGTG GAGGACGAGG CTCGCCGGAT 
GTGGGCGTCC GCGCAAAACA TTTCCGGTGC GGGCTGGAGT GGCATGGCCG AGGCGACCTC 
GCTAGACACC ATGGCCCAGA TGAATCAGGC GTTTCGCAAC ATCGTGAACA TGCTGCACGG 
GGTGCGTGAC GGGCTGGTTC GCGACGCCAA CAACTACGAG CAGCAAGAGC AGGCCTCCCA 
GCAGATCCTC AGCAGCTAAC GTCAGCCGCT GCAGCACAAT ACTTTTACAA GCGAAGGAGA 
ACAGGTTCGA TGACCATCAA CTATCAATTC GGGGATGTCG ACGCTCACGG CGCCATGATC 
CGCGCTCAGG CCGGGTTGCT GGAGGCCGAG CATCAGGCCA TCATTCGTGA TGTGTTGACC 
GCGAGTGACT TTTGGGGCGG CGCCGGTTCG GCGGCCTGCC AGGGGTTCAT TACCCAGTTG 
GGCCGTAACT TCCAGGTGAT CTACGAGCAG GCCAACGCCC ACGGGCAGAA GGTGCAGGCT 
GCCGGCAACA ACATGGCGCA AACCGACAGC GCCGTCGGCT CCAGCTGGGC CTGACACCAG 
GCCAAGGCCA GGGACGTGGT GTACGAGTGA AGTTCCTCGC GTGATCCTTC GGGTGGCAGT
CTAAGTGGTC AGTGCTGGGG TGTTGGTGGT TTGCTGCTTG GCGGGTTCTT CGGTGCTGGT 
CAGTGCTGCT CGGGCTCGGG TGAGGACCTC GAGGCCCAGG TAGCGCCGTC CTTCGATCCA 
TTCGTCGTGT TGTTCGGCGA GGACGGCTCC GACGAGGCGG ATGATCGAGG CGCGGTCGGG 
GAAGATGCCC ACGACGTCGG TTCGGCGTCG TACCTCTCGG TTGAGGCGTT CCTGGGGGTT 
GTTGGACCAG ATTTGGCGCC AGATCTGCTT GGGGAAGGCG GTGAACGCCA GCAGGTCGGT 
GCGGGCGGTG TCGAGGTGCT CGGCCACCGC GGGGAGTTTG TCGGTCAGAG CGTCGAGTAC 
CCGATCATAT TGGGCAACAA CTGATTCGGC GTCGGGCTGG TCGTAGATGG AGTGCAGCAG 
GGTGCGCACC CACGGCCAGG AGGGCTTCGG GGTGGCTGCC ATCAGATTGG CTGCGTAGTG 
GGTTCTGCAG CGCTGCCAGG CCGCTGCGGG CAGGGTGGCG CCGATCGCGG CCACCAGGCC 
GGCGTGGGCG TCGCTGGTGA CCAGCGCGAC CCCGGACAGG CCGCGGGCGA CCAGGTCGCG 
GAAGAACGCC AGCCAGCCGG CCCCGTCCTC GGCGGAGGTG ACCTGGATGC CCAGGATC 
(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 391 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

Met Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met

1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Gln Met Trp

20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser

35 40 45

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly

50 55 60

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr

65 70 75 80

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala

85 90 95

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala

100 105 110

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly

115 120 125

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met

130 135 140

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Ala Thr Ala

145 150 155 160

Thr Ala Thr Ala Thr Leu Leu Pro Phe Glu Glu Ala Pro Glu Met Thr

165 170 175

Ser Ala Gly Gly Leu Leu Glu Gln Ala Ala Ala Val Glu Glu Ala Ser

180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu

195 200 205

Gln Gln Leu Ala Gln Pro Thr Gln Gly Thr Thr Pro Ser Ser Lys Leu

210 215 220

Gly Gly Leu Trp Lys Thr Val Ser Pro His Arg Ser Pro Ile Ser Asn

225 230 235 240

Met Val Ser Met Ala Asn Asn His Met Ser Met Thr Asn Ser Gly Val

245 250 255

Ser Met Thr Asn Thr Leu Ser Ser Met Leu Lys Gly Phe Ala Pro Ala

260 265 270

Ala Ala Ala Gln Ala Val Gln Thr Ala Ala Gln Asn Gly Val Arg Ala

275 280 285

Met Ser Ser Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly Gly Gly

290 295 300

Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser Leu Ser Val

305 310 315 320

Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala Ala Arg

325 330 335

Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Glu Arg Gly Pro Gly

340 345 350

Gln Met Leu Gly Gly Leu Pro Val Gly Gln Met Gly Ala Arg Ala Gly

355 360 365

Gly Gly Leu Ser Gly Val Leu Arg Val Pro Pro Arg Pro Tyr Val Met

370 375 380

Pro His Ser Pro Ala Ala Gly

385 390

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1725 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 
GACGTCAGCA CCCGCCGTGC AGGGCTGGAG CGTGGTCGGT TTTGATCTGC GGTCAAGGTG 
ACGTCCCTCG GCGTGTCGCC GGCGTGGATG CAGACTCGAT GCCGCTCTTT AGTGCAACTA 
ATTTCGTTGA AGTGCCTGCG AGGTATAGGA CTTCACGATT GGTTAATGTA GCGTTCACCC 
CGTGTTGGGG TCGATTTGGC CGGACCAGTC GTCACCAACG CTTGGCGTGC GCGCCAGGCG 
GGCGATCAGA TCGCTTGACT ACCAATCAAT CTTGAGCTCC CGGGCCGATG CTCGGGCTAA 
ATGAGGAGGA GCACGCGTGT CTTTCACTGC GCAACCGGAG ATGTTGGCGG CCGCGGCTGG 
CGAACTTCGT TCCCTGGGGG CAACGCTGAA GGCTAGCAAT GCCGCCGCAG CCGTGCCGAC 
GACTGGGGTG GTGCCCCCGG CTGCCGACGA GGTGTCGCTG CTGCTTGCCA CACAATTCCG
TACGCATGCG GCGACGTATC AGACGGCCAG CGCCAAGGCC GCGGTGATCC ATGAGCAGTT 
TGTGACCACG CTGGCCACCA GCGCTAGTTC ATATGCGGAC ACCGAGGCCG CCAACGCTGT 
GGTCACCGGC TAGCTGACCT GACGGTATTC GAGCGGAAGG ATTATCGAAG TGGTGGATTT 
CGGGGCGTTA CCACCGGAGA TCAACTCCGC GAGGATGTAC GCCGGCCCGG GTTCGGCCTC 
GCTGGTGGCC GCCGCGAAGA TGTGGGACAG CGTGGCGAGT GACCTGTTTT CGGCCGCGTC 
GGCGTTTCAG TCGGTGGTCT GGGGTCTGAC GGTGGGGTCG TGGATAGGTT CGTCGGCGGG 
TCTGATGGCG GCGGCGGCCT CGCCGTATGT GGCGTGGATG AGCGTCACCG CGGGGCAGGC 
CCAGCTGACC GCCGCCCAGG TCCGGGTTGC TGCGGCGGCC TACGAGACAG CGTATAGGCT 
GACGGTGCCC CCGCCGGTGA TCGCCGAGAA CCGTACCGAA CTGATGACGC TGACCGCGAC 
CAACCTCTTG GGGCAAAACA CGCCGGCGAT CGAGGCCAAT CAGGCCGCAT ACAGCCAGAT 
GTGGGGCCAA GACGCGGAGG CGATGTATGG CTACGCCGCC ACGGCGGCGA CGGCGACCGA 
GGCGTTGCTG CCGTTCGAGG ACGCCCCACT GATCACCAAC CCCGGCGGGC TCCTTGAGCA 
GGCCGTCGCG GTCGAGGAGG CCATCGACAC CGCCGCGGCG AACCAGTTGA TGAACAATGT 
GCCCCAAGCG CTGCAACAGC TGGCCCAGCC AGCGCAGGGC GTCGTACCTT CTTCCAAGCT 
GGGTGGGCTG TGGACGGCGG TCTCGCCGCA TCTGTCGCCG CTCAGCAACG TCAGTTCGAT 
AGCCAACAAC CACATGTCGA TGATGGGCAC GGGTGTGTCG ATGACCAACA CCTTGCACTC 
GATGTTGAAG GGCTTAGCTC CGGCGGCGGC TCAGGCCGTG GAAACCGCGG CGGAAAACGG 
GGTCTGGGCG ATGAGCTCGC TGGGCAGCCA GCTGGGTTCG TCGCTGGGTT CTTCGGGTCT 
GGGCGCTGGG GTGGCCGCCA ACTTGGGTCG GGCGGCCTCG GTCGGTTCGT TGTCGGTGCC 
GCCAGCATGG GCCGCGGCCA ACCAGGCGGT CACCCCGGCG GCGCGGGCGC TGCCGCTGAC 
CAGCCTGACC AGCGCCGCCC AAACCGCCCC CGGACACATG CTGGG 
(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met 
1 5 10 15 

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp 
20 25 30 

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser 
35 40 45 

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly 
50 55 60 

Leu Met Ala Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr 
65 70 75 80 

Ala Gly Gln Ala Gln Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala 
85 90 95 

Ala Tyr Glu Thr Ala Tyr Arg Leu Thr Val Pro Pro Pro Val Ile Ala 
100 105 110 

Glu Asn Arg Thr Glu Leu Met Thr Leu Thr Ala Thr Asn Leu Leu Gly 
115 120 125 

Gln Asn Thr Pro Ala Ile Glu Ala Asn Gln Ala Ala Tyr Ser Gln Met 
130 135 140 

Trp Gly Gln Asp Ala Glu Ala Met Tyr Gly Tyr Ala Ala Thr Ala Ala 
145 150 155 160 

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr 
165 170 175 

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile 
180 185 190 

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu 
195 200 205 

Gln Gln Leu Ala Gln Pro Ala Gln Gly Val Val Pro Ser Ser Lys Leu 
210 215 220 

Gly Gly Leu Trp Thr Ala Val Ser Pro His Leu Ser Pro Leu Ser Asn 
225 230 235 240 

Val Ser Ser Ile Ala Asn Asn His Met Ser Met Met Gly Thr Gly Val 
245 250 255 

Ser Met Thr Asn Thr Leu His Ser Met Leu Lys Gly Leu Ala Pro Ala 
260 265 270 

Ala Ala Gln Ala Val Glu Thr Ala Ala Glu Asn Gly Val Trp Ala Met 
275 280 285 

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu 
290 295 300 

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser 
305 310 315 320 

Leu Ser Val Pro Pro Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro 
325 330 335 

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr 
340 345 350 

Ala Pro Gly His Met Leu Gly 
355 

(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3027 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 
AGTTCAGTCG AGAATGATAC TGACGGGCTG TATCCACGAT GGCTGAGACA ACCGAACCAC 
CGTCGGACGC GGGGACATCG CAAGCCGACG CGATGGCGTT GGCCGCCGAA GCCGAAGCCG 
CCGAAGCCGA AGCGCTGGCC GCCGCGGCGC GGGCCCGTGC CCGTGCCGCC CGGTTGAAGC 
GTGAGGCGCT GGCGATGGCC CCAGCCGAGG ACGAGAACGT CCCCGAGGAT ATGCAGACTG 
GGAAGACGCC GAAGACTATG ACGACTATGA CGACTATGAG GCCGCAGACC AGGAGGCCGC 
ACGGTCGGCA TCCTGGCGAC GGCGGTTGCG GGTGCGGTTA CCAAGACTGT CCACGATTGC 
CATGGCGGCC GCAGTCGTCA TCATCTGCGG CTTCACCGGG CTCAGCGGAT ACATTGTGTG 
GCAACACCAT GAGGCCACCG AACGCCAGCA GCGCGCCGCG GCGTTCGCCG CCGGAGCCAA 
GCAAGGTGTC ATCAACATGA CCTCGCTGGA CTTCAACAAG GCCAAAGAAG ACGTCGCGCG 
TGTGATCGAC AGCTCCACCG GCGAATTCAG GGATGACTTC CAGCAGCGGG CAGCCGATTT 
CACCAAGGTT GTCGAACAGT CCAAAGTGGT CACCGAAGGC ACGGTGAACG CGACAGCCGT 
CGAATCCATG AACGAGCATT CCGCCGTGGT GCTCGTCGCG GCGACTTCAC GGGTCACCAA 
TTCCGCTGGG GCGAAAGACG AACCACGTGC GTGGCGGCTC AAAGTGACCG TGACCGAAGA 
GGGGGGACAG TACAAGATGT CGAAAGTTGA GTTCGTACCG TGACCGATGA CGTACGCGAC 
GTCAACACCG AAACCACTGA CGCCACCGAA GTCGCTGAGA TCGACTCAGC CGCAGGCGAA 
GCCGGTGATT CGGCGACCGA GGCATTTGAC ACCGACTCTG CAACGGAATC TACCGCGCAG 
AAGGGTCAGC GGCACCGTGA CCTGTGGCGA ATGCAGGTTA CCTTGAAACC CGTTCCGGTG 
ATTCTCATCC TGCTCATGTT GATCTCTGGG GGCGCGACGG GATGGCTATA CCTTGAGCAA 
TACGACCCGA TCAGCAGACG GACTCCGGCG CCGCCCGTGC TGCCGTCGCC GCGGCGTCTG 
ACGGGACAAT CGCGCTGTTG TGTATTCACC CGACACGTCG ACCAAGACTT CGCTACCGCC 
AGGTCGCACC TCGCCGGCGA rTTCCTGTCC TATACGACCA GTTCACGCAG CAGATCGTGG 
CTCCGGCGGC CAAACAGAAG TCACTGAAAA CCACCGCCAA GGTGGTGCGC GCGGCCGTGT 
CGGAGCTACA TCCGGATTCG GCCGTCGTTC TGGTTTTTGT CGACCAGAGC ACTACCAGTA 
AGGACAGCCC CAATCCGTCG ATGGCGGCCA GCAGCGTGAT GGTGACCCTA GCCAAGGTCG 
ACGGCAATTG GCTGATCACC AAGTTCACCC CGGTTTAGGT TGCCGTAGGC GGTCGCCAAG 
TCTGACGGGG GCGCGGGTGG CTGCTCGTGC GAGATACCGG CCGTTCTCCG GACAATCACG 
GCCCGACCTC AAACAGATCT CGGCCGCTGT CTAATCGGCC GGGTTATTTA AGATTAGTTG 
CCACTGTATT TACCTGATGT TCAGATTGTT CAGCTGGATT TAGCTTCGCG GCAGGGCGGC 
TGGTGCACTT TGCATCTGGG GTTGTGACTA CTTGAGAGAA TTTGACCTGT TGCCGACGTT 
GTTTGCTGTC CATCATTGGT GCTAGTTATG GCCGAGCGGA AGGATTATCG AAGTGGTGGA 
CTTCGGGGCG TTACCACCGG AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC 
CTCGCTGGTG GCCGCCGCGA AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC 
GTCGGCGTTT CAGTCGGTGG TCTGGGGTCT GACGACGGGA TCGTGGATAG GTTCGTCGGC 
GGGTCTGATG GTGGCGGCGG CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA 
GGCCGAGCTG ACCGCCGCCC AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG 
GCTGACGGTG CCCCCGCCGG TGATCGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC 
GACCAACCTC TTGGGGCAAA ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGGGA 
GATGTGGGCC CAAGACGCCG CCGCGATGTT TGGCTACGCC GCCACGGCGG CGACGGCGAC 
CGAGGCGTTG CTGCCGTTCG AGGACGCCCC ACTGATCACC AACCCCGGCG GGCTCCTTGA 
GCAGGCCGTC GCGGTCGAGG AGGCCATCGA CACCGCCGCG GCGAACCAGT TGATGAACAA 
TGTGCCCCAA GCGCTGCAAC AACTGGCCCA GCCCACGAAA AGCATCTGGC CGTTCGACCA 
ACTGAGTGAA CTCTGGAAAG CCATCTCGCC GCATCTGTCG CCGCTCAGCA ACATCGTGTC 
GATGCTCAAC AACCACGTGT CGATGACCAA CTCGGGTGTG TCGATGGCCA GCACCTTGCA 
CTCAATGTTG AAGGGCTTTG CTCCGGCGGC GGCTCAGGCC GTGGAAACCG CGGCGCAAAA 
CGGGGTCCAG GCGATGAGCT CGCTGGGCAG CCAGCTGGGT TCGTCGCTGG GTTCTTCGGG 
TCTGGGCGCT GGGGTGGCCG CCAACTTGGG TCGGGCGGCC TCGGTCGGTT CGTTGTCGGT 
GCCGCAGGCC TGGGCCGCGG CCAACCAGGC GGTCACCCCG GCGGCGCGGG CGCTGCCGCT 
GACCAGCCTG ACCAGCGCCG CCCAAACCGC CCCCGGACAC ATGCTGGGCG GGCTACCGCT 
GGGGCAACTG ACCAATAGCG GCGGCGGGTT CGGCGGGGTT AGCAATGCGT TGCGGATGCC 
GCCGCGGGCG TACGTAATGC CCCGTGTGCC CGCCGCCGGG TAACGCCGAT CCGCACGCAA 
TGCGGGCCCT CTATGCGGGC AGCGATC 
(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met 
1 5 10 15 

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp 
20 25 30 

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser 
35 40 45 

Val Val Trp Gly Leu Thr Thr Gly Ser Trp Ile Gly Ser Ser Ala Gly 
50 55 60 

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr 
65 70 75 80 

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala 
85 90 95 

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala 
100 105 110 

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly 
115 120 125 

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met 
130 135 140 

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Thr Ala Ala 
145 150 155 160 

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr 
165 170 175 

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile 
180 185 190 

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu 
195 200 205 

Gln Gln Leu Ala Gln Pro Thr Lys Ser Ile Trp Pro Phe Asp Gln Leu 
210 215 220 

Ser Glu Leu Trp Lys Ala Ile Ser Pro His Leu Ser Pro Leu Ser Asn 
225 230 235 240 

Ile Val Ser Met Leu Asn Asn His Val Ser Met Thr Asn Ser Gly Val 
245 250 255 

Ser Met Ala Ser Thr Leu His Ser Met Leu Lys Gly Phe Ala Pro Ala 
260 265 270 

Ala Ala Gln Ala Val Glu Thr Ala Ala Gln Asn Gly Val Gln Ala Met 
275 280 285 

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu 
290 295 300 

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser 
305 310 315 320 

Leu Ser Val Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro 
325 330 335 

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr 
340 345 350 

Ala Pro Gly His Met Leu Gly Gly Leu Pro Leu Gly Gln Leu Thr Asn 
355 360 365 

Ser Gly Gly Gly Phe Gly Gly Val Ser Asn Ala Leu Arg Met Pro Pro 
370 375 380 

Arg Ala Tyr Val Met Pro Arg Val Pro Ala Ala Gly 
385 390 395 
(2) INFORMATION FOR SEQ ID NO: 112 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1616 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 
CATCGGAGGG AGTGATCACC ATGCTGTGGC ACGCAATGCC ACCGGAGTAA ATACCGCACG 
GCTGATGGCC GGCGCGGGTC CGGCTCCAAT GCTTGCGGCG GCCGCGGGAT GGCAGACGCT 
TTCGGCGGCT CTGGACGCTC AGGCCGTCGA GTTGACCGCG CGCCTGAACT CTCTGGGAGA 
AGCCTGGACT GGAGGTGGCA GCGACAAGGC GCTTGCGGCT GCAACGCCGA TGGTGGTCTG 
GCTACAAACC GCGTCAACAC AGGCCAAGAC CCGTGCGATG CAGGCGACGG CGCAAGCCGC 
GGCATACACC CAGGCCATGG CCACGACGCC GTCGCTGCCG GAGATCGCCG CCAACCACAT 
CACCCAGGCC GTCCTTACGG CCACCAACTT CTTCGGTATC AACACGATCC CGATCGCGTT 
GACCGAGATG GATTATTTCA TCCGTATGTG GAACCAGGCA GCCCTGGCAA TGGAGGTCTA 
CCAGGCCGAG ACCGCGGTTA ACACGCTTTT CGAGAAGCTC GAGCCGATGG CGTCGATCCT 
TGATCCCGGC GCGAGCCAGA GCACGACGAA CCCGATCTTC GGAATGCCCT CCCCTGGCAG 
CTCAACACCG GTTGGCCAGT TGCCGCCGGC GGCTACCCAG ACCCTCGGCC AACTGGGTGA 
GATGAGCGGC CCGATGCAGC AGCTGACCCA GCCGCTGCAG CAGGTGACGT CGTTGTTCAG 
CCAGGTGGGC GGCACCGGCG GCGGCAACCC AGCCGACGAG GAAGCCGCGC AGATGGGCCT 
GCTCGGCACC AGTCCGCTGT CGAACCATCC GCTGGCTGGT GGATCAGGCC CCAGCGCGGG 
CGCGGGCCTG CTGCGCGCGG AGTCGCTACC TGGCGCAGGT GGGTCGTTGA CCCGCACGCC 
GCTGATGTCT CAGCTGATCG AAAAGCCGGT TGCCCCCTCG GTGATGCCGG CGGCTGCTGC 
CGGATCGTCG GCGACGGGTG GCGCCGCTCC GGTGGGTGCG GGAGCGATGG GCCAGGGTGC 
GCAATCCGGC GGCTCCACCA GGCCGGGTCT GGTCGCGCCG GCACCGCTCG CGCAGGAGCG 
TGAAGAAGAC GACGAGGACG ACTGGGACGA AGAGGACGAC TGGTGAGCTC CCGTAATGAC 
AACAGACTTC CCGGCCACCC GGGCCGGAAG ACTTGCCAAC ATTTTGGCGA GGAAGGTAAA 
GAGAGAAAGT AGTC^GCAT GGCAGAGATG AAGACCGATG CCGCTACCCT CGCGCAGGAG 
GCAGGTAATT TCGAGCGGAT CTCCGGCGAC CTGAAAACCC AGATCGACCA GGTGGAGTCG 
ACGGCAGGTT CGTTGCAGGG CCAGTGGCGC GGCGCGGCGG GGACGGCCGC CCAGGCCGCG 
GTGGTGCGCT TCCAAGAAGC AGCCAATAAG CAGAAGCAGG AACTCGACGA GATCTCGACG 
AATATTCGTC AGGCCGGCGT CCAATACTCG AGGGCCGACG AGGAGCAGCA GCAGGCGCTG 
TCCTCGCAAA TGGGCTTCTG ACCCGCTAAT ACGAAAAGAA ACGGAGCAAA AACATGACAG 
AGCAGCAGTG GAATTTCGCG GGTATCGAGG CCGCGGCAAG CGCAATCCAG GGAAAT 
(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 432 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 
CTAGTGGATG GGACCATSGC CATTTTCTGC AGTCTCACTG CCTTCTGTGT TGACATTTTG 
GCACGCCGGC GGAAACGAAG CACTGGGGTC GAAGAACGGC TGCGCTGCCA TATCGTCCGG 
AGCTTCCATA CCTTCGTGCG GCCGGAAGAG CTTGTCGTAG TCGGCCGCCA TGACAACCTC 
TCAGAGTGCG CTCAAACGTA TAAACACGAG AAAGGGCGAG ACCGACGGAA GGTCGAACTC 
GCCCGATCCC GTGTTTCGCT ATTCTACGCG AACTCGGCGT TGCCCTATGC GAACATCCCA 
GTGACGTTGC CTTCGGTCGA AGCCATTGCC TGACCGGCTT CGCTGATCGT CCGCGCCAGG 
TTCTGCAGCG CGTTGTTCAG CTCGGTAGCC GTGGCGTCCC ATTTTTGCTG GACACCCTGG 
TACGCCTCCG AA 

(2) INFORMATION FOR SEQ ID NO: 114 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

Met Leu Trp His Ala Met Pro Pro Glu Xaa Asn Thr Ala Arg Leu Met 
1 5 10 15 

Ala Gly Ala Gly Pro Ala Pro Met Leu Ala Ala Ala Ala Gly Trp Gln 
20 25 30 

Thr Leu Ser Ala Ala Leu Asp Ala Gln Ala Val Glu Leu Thr Ala Arg 
35 40 45 

Leu Asn Ser Leu Gly Glu Ala Trp Thr Gly Gly Gly Ser Asp Lys Ala 
50 55 60 

Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu Gln Thr Ala Ser Thr 
65 70 75 80 

Gln Ala Lys Thr Arg Ala Met Gln Ala Thr Ala Gln Ala Ala Ala Tyr 
85 90 95 

Thr Gln Ala Met Ala Thr Thr Pro Ser Leu Pro Glu Ile Ala Ala Asn 
100 105 110 

His Ile Thr Gln Ala Val Leu Thr Ala Thr Asn Phe Phe Gly Ile Asn 
115 120 125 

Thr Ile Pro Ile Ala Leu Thr Glu Met Asp Tyr Phe Ile Arg Met Trp 
130 135 140 

Asn Gln Ala Ala Leu Ala Met Glu Val Tyr Gln Ala Glu Thr Ala Val 
145 150 155 160 

Asn Thr Leu Phe Glu Lys Leu Glu Pro Met Ala Ser Ile Leu Asp Pro 
165 170 175 

Gly Ala Ser Gln Ser Thr Thr Asn Pro Ile Phe Gly Met Pro Ser Pro 
180 185 190 

Gly Ser Ser Thr Pro Val Gly Gln Leu Pro Pro Ala Ala Thr Gln Thr 
195 200 205 

Leu Gly Gln Leu Gly Glu Met Ser Gly Pro Met Gln Gln Leu Thr Gln 
210 215 220 

Pro Leu Gln Gln Val Thr Ser Leu Phe Ser Gln Val Gly Gly Thr Gly 
225 230 235 240 

Gly Gly Asn Pro Ala Asp Glu Glu Ala Ala Gln Met Gly Leu Leu Gly 
245 250 255 

Thr Ser Pro Leu Ser Asn His Pro Leu Ala Gly Gly Ser Gly Pro Ser 
260 265 270 

Ala Gly Ala Gly Leu Leu Arg Ala Glu Ser Leu Pro Gly Ala Gly Gly 
275 280 285 

Ser Leu Thr Arg Thr Pro Leu Met Ser Gln Leu Ile Glu Lys Pro Val 
290 295 300 

Ala Pro Ser Val Met Pro Ala Ala Ala Ala Gly Ser Ser Ala Thr Gly 
305 310 315 320 

Gly Ala Ala Pro Val Gly Ala Gly Ala Met Gly Gln Gly Ala Gln Ser 
325 330 335 

Gly Gly Ser Thr Arg Pro Gly Leu Val Ala Pro Ala Pro Leu Ala Gln 
340 345 350 

Glu Arg Glu Glu Asp Asp Glu Asp Asp Trp Asp Glu Glu Asp Asp Trp 
355 360 365 

(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Met Ala Glu Met Lys Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly 
1 5 10 15 

Asn Phe Glu Arg Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val 
20 25 30 

Glu Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly 
35 40 45 

Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys 
50 55 60 

Gln Lys Gln Glu Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly 
65 70 75 80 

Val Gln Tyr Ser Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser 
85 90 95 

Gln Met Gly Phe 
100 

(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 
GATCTCCGGC GACCTGAAAA CCCAGATCGA CCAGGTGGAG TCGACGGCAG GTTCGTTGCA 
GGGCCAGTGG CGCGGCGCGG CGGGGACGGC CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA 
AGCAGCCAAT AAGCAGAAGC AGGAACTCGA CGAGATCTCG ACGAATATTC GTCAGGCCGG 
CGTCCAATAC TCGAGGGCCG ACGAGGAGCA GCAGCAGGCG CTGTCCTCGC AAATGGGCTT 
CTGACCCGCT AATACGAAAA GAAACGGAGC AAAAACATGA CAGAGCAGCA GTGGAATTTC 
GCGGGTATCG AGGCCGCGGC AAGCGCAATC CAGGGAAATG TCACGTCCAT TCATTCCCTC 
CTTGACGAGG GGAAGCAGTC CCTGACCAAG CTCGCA 
(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala 
1 5 10 15 

Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln 
20 25 30 

Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu 
35 40 45 

Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser 
50 55 60 

Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe 
65 70 75 80 



(2) INFORMATION FOR SEQ ID NO: 118 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 
GTGGATCCCG ATCCCGTGTT TCGCTATTCT ACGCGAACTC GGCGTTGCCC TATGCGAACA 
TCCCAGTGAC GTTGCCTTCG GTCGAAGCCA TTGCCTGACC GGCTTCGCTG ATCGTCCGCG 
CCAGGTTCTG CAGCGCGTTG TTCAGCTCGG TAGCCGTGGC GTCCCATTTT TGCTGGACAC 
CCTGGTACGC CTCCGAACCG CTACCGCCCC AGGCCGCTGC GAGCTTGGTC AGGGACTGCT 
TCCCCTCGTC AAGGAGGGAA TGAATGGACG TGACATTTCC CTGGATTGCG CTTGCCGCGG 
CCTCGATACC CGCGAAATTC CACTGCTGCT CTGTCATGTT TTTGCTCCGT TTCTTTTCGT 
ATTAGCGGGT CAGAAGCCCA TTTGCGA 
(2) INFORMATION FOR SEQ ID NO: 119 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 
CGGCACGAGG ATCTCGGTTG GCCCAACGGC GCTGGCGAGG GCTCCGTTCC GGGGGCGAGC 
TGCGCGCCGG ATGCTTCCTC TGCCCGCAGC CGCGCCTGGA TGGATGGACC AGTTGCTACC 
TTCCCGACGT TTCGTTCGGT GTCTGTGCGA TAGCGGTGAC CCCGGCGCGC ACGTCGGGAG 
TGTTGGGGGG CAGGCCGGGT CGGTGGTTCG GCCGGGGACG CAGACGGTCT GGACGGAACG 
GGCGGGGGTT CGCCGATTGG CATCTTTGCC CA 
(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val 
1 5 10 15 

Val Ala Ala Leu 
20 



(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 
1 5 10 15 
(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122 : 

Ala Ala Mec L ys Pro ^ Thr Qly ^ &y ^ ^ ^ ^ ^ 

D in J 

Glu Gly Arg 

(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 
Tyr Tyr Trp Cys Pro Gly Gln Pro Phe ^ Pro ^ 

10 15 

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala ^ 
1 5 10 
(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro 
1 5 10 
(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Pro Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro 

10 15 

Ser 



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser 

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn 

20 25 30 

(2) INFORMATION FOR SEQ ID NO: 129: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

Asp Pro Pro Asp Pro His Gln Xaa Asp Met Thr Lys Gly Tyr Tyr Pro 
1 5 10 15 

Gly Gly Arg Arg Xaa Phe 
20 



(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Asp Pro Gly Tyr Thr Pro Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Second Residue Can Be Either Pro or Thr" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Xaa Xaa Gly Phe Thr Gly Pro Gln Phe Tyr 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Third Residue Can Be Either Gln or Leu" 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

Xaa Pro Xaa Val Thr Ala Tyr Ala Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

Xaa Xaa Xaa Glu Lys Pro Phe Leu Arg 

1 5 

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

Xaa Asp Ser Glu Lys Ser Ala Thr Ile Lys Val Thr Asp Ala Ser 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

Ala Gly Asp Thr Xaa Ile Tyr Ile Val Gly Asn Leu Thr Ala Asp 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:136: 

Ala Pro Glu Ser Gly Ala Gly Leu Gly Gly Thr Val Gln Ala Gly 
1 5 10 15 



(2) INFORMATION FOR SEQ ID NO: 137 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

Xaa Tyr Ile Ala Tyr Xaa Thr Thr Ala Gly Ile Val Pro Gly Lys Ile 
1 5 10 15 

Asn Val His Leu Val 
20 

(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 882 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 
GCAACGCTGT CGTGGCCTTT GCGGTGATCG GTTTCGCCTC GCTGGCGGTG GCGGTGGCGG 60 
TCACCATCCG ACCGACCGCG GCCTCAAAAC CGGTAGAGGG ACACCAAAAC GCCCAGCCAG 120 
GGAAGTTCAT GCCGTTGTTG CCGACGCAAC AGCAGGCGCC GGTCCCGCCG CCTCCGCCCG 180 
ATGATCCCAC CGCTGGATTC CAGGGCGGCA CCATTCCGGC TGTACAGAAC GTGGTGCCGC 240 
GGCCGGGTAC CTCACCCGGG GTGGGTGGGA CGCCGGCTTC GCCTGCGCCG GAAGCGCCGG 300 
CCGTGCCCGG TGTTGTGCCT GCCCCGGTGC CAATCCCGGT CCCGATCATC ATTCCCCCGT 360 
TCCCGGGTTG GCAGCCTGGA ATGCCGACCA TCCCCACCGC ACCGCCGACG ACGCCGGTGA 420 
CCACGTCGGC GACGACGCCG CCGACCACGC CGCCGACCAC GCCGGTGACC ACGCCGCCAA 480 
CGACGCCGCC GACCACGCCG GTGACCACGC CGCCAACGAC GCCGCCGACC ACGCCGGTGA 540 
CCACGCCACC AACGACCGTC GCCCCGACGA CCGTCGCCCC 1 LLGACCA 600 
CCGTCGCCCC GACCACGGTC GCTCCAGCCA CCGCCACGCC GACGACCGTC GCTCCGCAGC 660 
CGACGCAGCA GCCCACGCAA CAACCAACCC AACAGATGCC AACCCAGCAG CAGACCGTGG 720 
CCCCGCAGAC GGTGGCGCCG GCTCCGCAGC CGCCGTCCGG TGGCCGCAAC GGCAGCGGCG 780 
GGGGCGACTT ATTCGGCGGG TTCTGATCAC GGTCGCGGCT TCACTACGGT CGGAGGACAT 840 
GGCCGGTGAT GCGGTGACGG TGGTGCTGCC CTGTCTCAAC GA 882 



(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 815 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 
CCATCAACCA ACCGCTCGCG CCGCCCGCGC CGCCGGATCC GCCGTCGCCG CCACGCCCGC 
CGGTGCCTCC GGTGCCCCCG TTGCCGCCGT CGCCGCCGTC GCCGCCGACC GGCTGGGTGC 
CTAGGGCGCT GTTACCGCCC TGGTTGGCGG GGACGCCGCC GGCACCACCG GTACCGCCGA 
TGGCGCCGTT GCCGCCGGCG GCACCGTTGC CACCGTTGCC ACCGTTGCCA CCGTTGCCGA 
CCAGCCACCC GCCGCGACCA CCGGCACCGC CGGCGCCGCC CGCACCGCCG GCGTGCCCGT 
TCGTGCCCGT ACCGCCGGCA CCGCCGTTGC CGCCGTCACC GCCGACGGAA CTACCGGCGG 
ACGCGGCCTG CCCGCCGGCG CCGCCCGCAC CGCCATTGGC ACCGCCGTCA CCGCCGGCTG 
GGAGTGCCGC GATTAGGGCA CTGACCGGCG CAACCAGCGC AAGTACTCTC GGTCACCGAG 
CACTTCCAGA CGACACCACA GCACGGGGTT GTCGGCGGAC TGGGTGAAAT GGCAGCCGAT 
AGCGGCTAGC TGTCGGCTGC GGTCAACCTC GATCATGATG TCGAGGTGAC CGTGACCGCG 
CCCCCCGAAG GAGGCGCTGA ACTCGGCGTT GAGCCGATCG GCGATCGGTT GGGGCAGTGC 
CCAGGCCAAT ACGGGGATAC CGGGTGTCNA AGCCGCCGCG AGCGCAGCTT CGGTTGCGCG 
ACNGTGGTCG GGGTGGCCTG TTACGCCGTT GTCNTCGAAC ACGAGTAGCA GGTCTGCTCC 
GGCGAGGGCA TCCACCACGC GTTGCGTCAG CTCGT 
(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1152 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 
ACCAGCCGCC GGCTGAGGTC TCAGATCAGA GAGTCTCCGG ACTCACCGGG GCGGTTCAGC 
CTTCTCCCAG AACAACTGCT GAAGATCCTC GCCCGCGAAA CAGGCGCTGA TTTGACGCTC 
TATGACCGGT TGAACGACGA GATCATCCGG CAGATTGATA TGGCACCGCT GGGCTAACAG 
GTGCGCAAGA TGGTGCAGCT GTATGTCTCG GACTCCGTGT CGCGGATCAG CTTTGCCGAC 
GGCCGGGTGA TCGTGTGGAG CGAGGAGCTC GGCGAGAGCC AGTATCCGAT CGAGACGCTG 
GACGGCATCA CGCTGTTTGG GCGGCCGACG ATGACAACGC CCTTCATCGT TGAGATGCTC 
AAGCGTGAGC GCGACATCCA GCTCTTCACG ACCGACGGCC ACTACCAGGG CCGGATCTCA 
ACACCCGACG TGTCATACGC GCCGCGGCTC CGTCAGCAAG TTCACCGCAC CGACGATCCT 
GCGTTCTGCC TGTCGTTAAG CAAGCGGATC GTGTCGAGGA AGATCCTGAA TCAGCAGGCC 
TTGATTCGGG CACACACGTC GGGGCAAGAC GTTGCTGAGA GCATCCGCAC GATGAAGCAC 
TCGCTGGCCT GGGTCGATCG ATCGGGCTCC CTGGCGGAGT TGAACGGGTT CGAGGGAAAT 
GCCGCAAAGG CATACTTCAC CGCGCTGGGG CATCTCGTCC CGCAGGAGTT CGCATTCCAG 
GGCCGCTCGA CTCGGCCGCC GTTGGACGCC TTCAACTCGA TGGTCAGCCT CGGCTATTCG 
CTGCTGTACA AGAACATCAT AGGGGCGATC GAGCGTCACA GCCTGAACGC GTATATCGGT 
TTCCTACACC AGGATTCACG AGGGCACGCA ACGTCTCGTG CCGAATTCGG CACGAGCTCC 
GCTGAAACCG CTGGCCGGCT GCTCAGTGCC CGTACGTAAT CCGCTGCGCC CAGGCCGGCC 
CGCCGGCCGA ATACCAGCAG ATCGGACAGC GAATTGCCGC CCAGCCGGTT GGAGCCGTGC 
ATACCGCCGG CACACTCACC GGCAGCGAAC AGGCCTGGCA CCGTGGCGGC GCCGGTGTCC 
GCGTCTACTT CGACACCGCC CATCACGTAG TGACACGTCG GCCCGACTTC CATTGCCTGC 
GTTCGGCACG AG 
(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 

CTCGTGCCGA TTCGGCAGGG TGTACTTGCC GGTGGTGTAN GCCGCATGAG TGCCGACGAC 60 
CAGCAATGCG GCAACAGCAC GGATCCCGGT CAACGACGCC ACCCGGTCCA CGTGGGCGAT 120 
CCGCTCGAGT CCGCCCTGGG CGGCTCTTTC CTTGGGCAGG GTCATCCGAC GTGTTTCCGC 180 
CGTGGTTTGC CGCCATTATG CCGGCGCGCC GCGTCGGGCG GCCGGTATGG CCGAANGTCG 240 
ATCAGCACAC CCGAGATACG GGTCTGTGCA AGCTTTTTGA GCGTCGCGCG GGGCAGCTTC 300 
GCCGGCAATT CTACTAGCGA GAAGTCTGGC CCGATACGGA TCTGACCGAA GTCGCTGCGG 360 
TGCAGCCCAC CCTCATTGGC GATGGCGCCG ACGATGGCGC CTGGACCGAT CTTGTGCCGC 420 
TTGCCGACGG CGACGCGGTA GGTGGTCAAG TCCGGTCTAC GCTTGGGCCT TTGCGGACGG 480 
TCCCGACGCT GGTCGCGGTT GCGCCGCGAA AGCGGCGGGT CGGGTGCCAT CAGGAATGCC 540 
TCACCGCCGC GGCACTGCAC GGCCAGTGCC GCGGCGATGT CAGCCATCGG GACATCATGC 600 
TCGCGTTCAT ACTCCTCGAC CAGTCGGCGG AACAGCTCGA TTCCCGGACC GCCCA 655 
(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 

Asn Ala Val Val Ala Phe Ala Val Ile Gly Phe Ala Ser Leu Ala Val 
1 5 10 15 

Ala Val Ala Val Thr Ile Arg Pro Thr Ala Ala Ser Lys Pro Val Glu 
20 25 30 

Gly His Gln Asn Ala Gln Pro Gly Lys Phe Met Pro Leu Leu Pro Thr 
35 40 45 

Gln Gln Gln Ala Pro Val Pro Pro Pro Pro Pro Asp Asp Pro Thr Ala 
50 55 60 

Gly Phe Gln Gly Gly Thr Ile Pro Ala Val Gln Asn Val Val Pro Arg 
65 70 75 80 

Pro Gly Thr Ser Pro Gly Val Gly Gly Thr Pro Ala Ser Pro Ala Pro 
85 90 95 

Glu Ala Pro Ala Val Pro Gly Val Val Pro Ala Pro Val Pro Ile Pro 
100 105 110 

Val Pro Ile Ile Ile Pro Pro Phe Pro Gly Trp Gln Pro Gly Met Pro 
115 120 125 

Thr Ile Pro Thr Ala Pro Pro Thr Thr Pro Val Thr Thr Ser Ala Thr 
130 135 140 

Thr Pro Pro Thr Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr 
145 150 155 160 

Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr Thr Pro Pro Thr 
165 170 175 

Thr Pro Val Thr Thr Pro Pro Thr Thr Val Ala Pro Thr Thr Val Ala 
180 185 190 

Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro 
195 200 205 

Ala Thr Ala Thr Pro Thr Thr Val Ala Pro Gln Pro Thr Gln Gln Pro 
210 215 220 

Thr Gln Gln Pro Thr Gln Gln Met Pro Thr Gln Gln Gln Thr Val Ala 
225 230 235 240 

Pro Gln Thr Val Ala Pro Ala Pro Gln Pro Pro Ser Gly Gly Arg Asn 
245 250 255 

Gly Ser Gly Gly Gly Asp Leu Phe Gly Gly Phe 
260 265 



(2) INFORMATION FOR SEQ ID NO: 143: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:143: 

Ile Asn Gln Pro Leu Ala Pro Pro Ala Pro Pro Asp Pro Pro Ser Pro 
1 5 10 15 

Pro Arg Pro Pro Val Pro Pro Val Pro Pro Leu Pro Pro Ser Pro Pro 
20 25 30 

Ser Pro Pro Thr Gly Trp Val Pro Arg Ala Leu Leu Pro Pro Trp Leu 
35 40 45 

Ala Gly Thr Pro Pro Ala Pro Pro Val Pro Pro Met Ala Pro Leu Pro 
50 55 60 

Pro Ala Ala Pro Leu Pro Pro Leu Pro Pro Leu Pro Pro Leu Pro Thr 
65 70 75 80 

Ser His Pro Pro Arg Pro Pro Ala Pro Pro Ala Pro Pro Ala Pro Pro 
85 90 95 

Ala Cys Pro Phe Val Pro Val Pro Pro Ala Pro Pro Leu Pro Pro Ser 
100 105 110 

Pro Pro Thr Glu Leu Pro Ala Asp Ala Ala Cys Pro Pro Ala Pro Pro 
115 120 125 

Ala Pro Pro Leu Ala Pro Pro Ser Pro Pro Ala Gly Ser Ala Ala Ile 
130 135 140 

Arg Ala Leu Thr Gly Ala Thr Ser Ala Ser Thr Leu Gly His Arg Ala 
145 150 155 160 

Leu Pro Asp Asp Thr Thr Ala Arg Gly Cys Arg Arg Thr Gly 
165 170 

(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 
Gln Pro Pro Ala Glu Val Ser Asp Gln Arg Val Ser Gly Leu Thr Gly 
1 5 10 15 

Ala Val Gln Pro Ser Pro Arg Thr Thr Ala Glu Asp Pro Arg Pro Arg 
20 25 30 

Asn Arg Arg 
35 

(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

Arg Ala Asp Ser Ala Gly Cys Thr Cys Arg Trp Cys Xaa Pro His Glu 
1 5 10 15 

Cys Arg Arg Pro Ala Met Arg Gln Gln His Gly Ser Arg Ser Thr Thr 
20 25 30 

Pro Pro Gly Pro Arg Gly Arg Ser Ala Arg Val Arg Pro Gly Arg Leu 
35 40 45 

Phe Pro Trp Ala Gly Ser Ser Asp Val Phe Pro Pro Trp Phe Ala Ala 
50 55 60 

Ile Met Pro Ala Arg Arg Val Gly Arg Pro Val Trp Pro Xaa Val Asp 
65 70 75 80 

Gln His Thr Arg Asp Thr Gly Leu Cys Lys Leu Phe Glu Arg Arg Ala 
85 90 95 

Gly Gln Leu Arg Arg Gln Phe Tyr 
100 

(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer" 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC 53

(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR Primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

CCTGAATTCA GGCCTCGGTT GCGCCGGCCT CATCTTGAAC GA 42

(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR Primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 31 

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 31

(2) INFORMATION FOR SEQ ID NO:150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 33

(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:

GAGAGAATTC TCAGAAGCCC ATTTGCGAGG ACA 33

(2) INFORMATION FOR SEQ ID NO: 152: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 152..1273

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120

GCGGAAATTG AAGAGCACAG AAAGGTATGG C GTG AAA ATT CGT TTG CAT ACG 172
Val Lys Ile Arg Leu His Thr
1 5

CTG TTG GCC GTG TTG ACC GCT GCG CCG CTG CTG CTA GCA GCG GCG GGC 220
Leu Leu Ala Val Leu Thr Ala Ala Pro Leu Leu Leu Ala Ala Ala Gly
10 15 20

TGT GGC TCG AAA CCA CCG AGC GGT TCG CCT GAA ACG GGC GCC GGC GCC 268
Cys Gly Ser Lys Pro Pro Ser Gly Ser Pro Glu Thr Gly Ala Gly Ala
25 30 35

GGT ACT GTC GCG ACT ACC CCC GCG TCG TCG CCG GTG ACG TTG GCG GAG 316
Gly Thr Val Ala Thr Thr Pro Ala Ser Ser Pro Val Thr Leu Ala Glu
40 45 50 55

ACC GGT AGC ACG CTG CTC TAC CCG CTG TTC AAC CTG TGG GGT CCG GCC 364
Thr Gly Ser Thr Leu Leu Tyr Pro Leu Phe Asn Leu Trp Gly Pro Ala
60 65 70

TTT CAC GAG AGG TAT CCG AAC GTC ACG ATC ACC GCT CAG GGC ACC GGT 412
Phe His Glu Arg Tyr Pro Asn Val Thr Ile Thr Ala Gln Gly Thr Gly
75 80 85

TCT GGT GCC GGG ATC GCG CAG GCC GCC GCC GGG ACG GTC AAC ATT GGG 460
Ser Gly Ala Gly Ile Ala Gln Ala Ala Ala Gly Thr Val Asn Ile Gly
90 95 100

GCC TCC GAC GCC TAT CTG TCG GAA GGT GAT ATG GCC GCG CAC AAG GGG 508
Ala Ser Asp Ala Tyr Leu Ser Glu Gly Asp Met Ala Ala His Lys Gly
105 110 115

CTG ATG AAC ATC GCG CTA GCC ATC TCC GCT CAG CAG GTC AAC TAC AAC 556
Leu Met Asn Ile Ala Leu Ala Ile Ser Ala Gln Gln Val Asn Tyr Asn
120 125 130 135

CTG CCC GGA GTG AGC GAG CAC CTC AAG CTG AAC GGA AAA GTC CTG GCG 604
Leu Pro Gly Val Ser Glu His Leu Lys Leu Asn Gly Lys Val Leu Ala
140 145 150

GCC ATG TAC CAG GGC ACC ATC AAA ACC TGG GAC GAC CCG CAG ATC GCT 652
Ala Met Tyr Gln Gly Thr Ile Lys Thr Trp Asp Asp Pro Gln Ile Ala
155 160 165



GCG CTC AAC CCC GGC GTG AAC CTG CCC GGC ACC GCG GTA GTT CCG CTG 700
Ala Leu Asn Pro Gly Val Asn Leu Pro Gly Thr Ala Val Val Pro Leu
170 175 180

CAC CGC TCC GAC GGG TCC GGT GAC ACC TTC TTG TTC ACC CAG TAC CTG 748
His Arg Ser Asp Gly Ser Gly Asp Thr Phe Leu Phe Thr Gln Tyr Leu
185 190 195

TCC AAG CAA GAT CCC GAG GGC TGG GGC AAG TCG CCC GGC TTC GGC ACC 796
Ser Lys Gln Asp Pro Glu Gly Trp Gly Lys Ser Pro Gly Phe Gly Thr
200 205 210 215

ACC GTC GAC TTC CCG GCG GTG CCG GGT GCG CTG GGT GAG AAC GGC AAC 844
Thr Val Asp Phe Pro Ala Val Pro Gly Ala Leu Gly Glu Asn Gly Asn
220 225 230

GGC GGC ATG GTG ACC GGT TGC GCC GAG ACA CCG GGC TGC GTG GCC TAT 892
Gly Gly Met Val Thr Gly Cys Ala Glu Thr Pro Gly Cys Val Ala Tyr
235 240 245

ATC GGC ATC AGC TTC CTC GAC CAG GCC AGT CAA CGG GGA CTC GGC GAG 940
Ile Gly Ile Ser Phe Leu Asp Gln Ala Ser Gln Arg Gly Leu Gly Glu
250 255 260

GCC CAA CTA GGC AAT AGC TCT GGC AAT TTC TTG TTG CCC GAC GCG CAA 988
Ala Gln Leu Gly Asn Ser Ser Gly Asn Phe Leu Leu Pro Asp Ala Gln
265 270 275

AGC ATT CAG GCC GCG GCG GCT GGC TTC GCA TCG AAA ACC CCG GCG AAC 1036
Ser Ile Gln Ala Ala Ala Ala Gly Phe Ala Ser Lys Thr Pro Ala Asn
280 285 290 295

CAG GCG ATT TCG ATG ATC GAC GGG CCC GCC CCG GAC GGC TAC CCG ATC 1084
Gln Ala Ile Ser Met Ile Asp Gly Pro Ala Pro Asp Gly Tyr Pro Ile
300 305 310

ATC AAC TAC GAG TAC GCC ATC GTC AAC AAC CGG CAA AAG GAC GCC GCC 1132
Ile Asn Tyr Glu Tyr Ala Ile Val Asn Asn Arg Gln Lys Asp Ala Ala
315 320 325

ACC GCG CAG ACC TTG CAG GCA TTT CTG CAC TGG GCG ATC ACC GAC GGC 1180
Thr Ala Gln Thr Leu Gln Ala Phe Leu His Trp Ala Ile Thr Asp Gly
330 335 340

AAC AAG GCC TCG TTC CTC GAC CAG GTT CAT TTC CAG CCG CTG CCG CCC 1228
Asn Lys Ala Ser Phe Leu Asp Gln Val His Phe Gln Pro Leu Pro Pro
345 350 355

GCG GTG GTG AAG TTG TCT GAC GCG TTG ATC GCG ACG ATT TCC AGC 1273
Ala Val Val Lys Leu Ser Asp Ala Leu Ile Ala Thr Ile Ser Ser
360 365 370

TAGCCTCGTT GACCACCACG CGACAGCAAC CTCCGTCGGG CCATCGGGCT GCTTTGCGGA 1333

GCATGCTGGC CCGTGCCGGT GAAGTCGGCC GCGCTGGCCC GGCCATCCGG TGGTTGGGTG 1393

GGATAGGTGC GGTGATCCCG CTGCTTGCGC TGGTCTTGGT GCTGGTGGTG CTGGTCATCG 1453

AGGCGATGGG TGCGATCAGG CTCAACGGGT TGCATTTCTT CACCGCCACC GAATGGAATC 1513

CAGGCAACAC CTACGGCGAA ACCGTTGTCA CCGACGCGTC GCCCATCCGG TCGGCGCCTA 1573

CTACGGGGCG TTGCCGCTGA TCGTCGGGAC GCTGGCGACC TCGGCAATCG CCCTGATCAT 1633

CGCGGTGCCG GTCTCTGTAG GAGCGGCGCT GGTGATCGTG GAACGGCTGC CGAAACGGTT 1693

GGCCGAGGCT GTGGGAATAG TCCTGGAATT GCTCGCCGGA ATCCCCAGCG TGGTCGTCGG 1753

TTTGTGGGGG GCAATGACGT TCGGGCCGTT CATCGCTCAT CACATCGCTC CGGTGATCGC 1813

TCACAACGCT CCCGATGTGC CGGTGCTGAA CTACTTGCGC GGCGACCCGG GCAACGGGGA 1873

GGGCATGTTG GTGTCCGGTC TGGTGTTGGC GGTGATGGTC GTTCCCATTA TCGCCACCAC 1933

CACTCATGAC CTGTTCCGGC AGGTGCCGGT GTTGCCCCGG GAGGGCGCGA TCGGGAATTC 1993
(2) INFORMATION FOR SEQ ID NO:153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

Val Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120

GCGGAAATTG AAGAGCACAG AAAGGTATGG CGTGAAAATT CGTTTGCATA CGCTGTTGGC 180

CGTGTTGACC GCTGCGCCGC TGCTGCTAGC AGCGGCGGGC TGTGGCTCGA AACCACCGAG 240

CGGTTCGCCT GAAACGGGCG CCGGCGCCGG TACTGTCGCG ACTACCCCCG CGTCGTCGCC 300

GGTGACGTTG GCGGAGACCG GTAGCACGCT GCTCTACCCG CTGTTCAACC TGTGGGGTCC 360

GGCCTTTCAC GAGAGGTATC CGAACGTCAC GATCACCGCT CAGGGCACCG GTTCTGGTGC 420

CGGGATCGCG CAGGCCGCCG CCGGGACGGT CAACATTGGG GCCTCCGACG CCTATCTGTC 480

GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT GATGAACATC GCGCTAGCCA TCTCCGCTCA 540

GCAGGTCAAC TACAACCTGC CCGGAGTGAG CGAGCACCTC AAGCTGAACG GAAAAGTCCT 600

GGCGGCCATG TACCAGGGCA CCATCAAAAC CTGGGACGAC CCGCAGATCG CTGCGCTCAA 660

CCCCGGCGTG AACCTGCCCG GCACCGCGGT AGTTCCGCTG CACCGCTCCG ACGGGTCCGG 720

TGACACCTTC TTGTTCACCC AGTACCTGTC CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC 780

GCCCGGCTTC GGCACCACCG TCGACTTCCC GGCGGTGCCG GGTGCGCTGG GTGAGAACGG 840

CAACGGCGGC ATGGTGACCG GTTGCGCCGA GACACCGGGC TGCGTGGCCT ATATCGGCAT 900

CAGCTTCCTC GACCAGGCCA GTCAACGGGG ACTCGGCGAG GCCCAACTAG GCAATAGCTC 960

TGGCAATTTC TTGTTGCCCG ACGCGCAAAG CATTCAGGCC GCGGCGGCTG GCTTCGCATC 1020

GAAAACCCCG GCGAACCAGG CGATTTCGAT GATCGACGGG CCCGCCCCGG ACGGCTACCC 1080

GATCATCAAC TACGAGTACG CCATCGTCAA CAACCGGCAA AAGGACGCCG CCACCGCGCA 1140

GACCTTGCAG GCATTTCTGC ACTGGGCGAT CACCGACGGC AACAAGGCCT CGTTCCTCGA 1200

CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC GGTGGTGAAG TTGTCTGACG CGTTGATCGC 1260

GACGATTTCC AGCTAGCCTC GTTGACCACC ACGCGACAGC AACCTCCGTC GGGCCATCGG 1320

GCTGCTTTGC GGAGCATGCT GGCCCGTGCC GGTGAAGTCG GCCGCGCTGG CCCGGCCATC 1380

CGGTGGTTGG GTGGGATAGG TGCGGTGATC CCGCTGCTTG CGCTGGTCTT GGTGCTGGTG 1440

GTGCTGGTCA TCGAGGCGAT GGGTGCGATC AGGCTCAACG GGTTGCATTT CTTCACCGCC 1500

ACCGAATGGA ATCCAGGCAA CACCTACGGC GAAACCGTTG TCACCGACGC GTCGCCCATC 1560

CGGTCGGCGC CTACTACGGG GCGTTGCCGC TGATCGTCGG GACGCTGGCG ACCTCGGCAA 1620

TCGCCCTGAT CATCGCGGTG CCGGTCTCTG TAGGAGCGGC GCTGGTGATC GTGGAACGGC 1680

TGCCGAAACG GTTGGCCGAG GCTGTGGGAA TAGTCCTGGA ATTGCTCGCC GGAATCCCCA 1740

GCGTGGTCGT CGGTTTGTGG GGGGCAATGA CGTTCGGGCC GTTCATCGCT CATCACATCG 1800

CTCCGGTGAT CGCTCACAAC GCTCCCGATG TGCCGGTGCT GAACTACTTG CGCGGCGACC 1860

CGGGCAACGG GGAGGGCATG TTGGTGTCCG GTCTGGTGTT GGCGGTGATG GTCGTTCCCA 1920

TTATCGCCAC CACCACTCAT GACCTGTTCC GGCAGGTGCC GGTGTTGCCC CGGGAGGGCG 1980

CGATCGGGAA TTC 1993
(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 374 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

Met Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO:156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1777 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

GGTCTTGACC ACCACCTGGG TGTCGAAGTC GGTGCCCGGA TTGAAGTCCA GGTACTCGTG 60

GGTGGGGCGG GCGAAACAAT AGCGACAAGC ATGCGAGCAG CCGCGGTAGC CGTTGACGGT 120

GTAGCGAAAC GGCAACGCGG CCGCGTTGGG CACCTTGTTC AGCGCTGATT TGCACAACAC 180

CTCGTGGAAG GTGATGCCGT CGAATTGTGG CGCGCGAACG CTGCGGACCA GGCCGATCCG 240

CTGCAACCCG GCAGCGCCCG TCGTCAACGG GCATCCCGTT CACCGCGACG GCTTGCCGGG 300

CCCAACGCAT ACCATTATTC GAACAACCGT TCTATACTTT GTCAACGCTG GCCGCTACCG 360

AGCGCCGCAC AGGATGTGAT ATGCCATCTC TGCCCGCACA GACAGGAGCC AGGCCTTATG 420

ACAGCATTCG GCGTCGAGCC CTACGGGCAG CCGAAGTACC TAGAAATCGC CGGGAAGCGC 480

ATGGCGTATA TCGACGAAGG CAAGGGTGAC GCCATCGTCT TTCAGCACGG CAACCCCACG 540

TCGTCTTACT TGTGGCGCAA CATCATGCCG CACTTGGAAG GGCTGGGCCG GCTGGTGGCC 600

TGCGATCTGA TCGGGATGGG CGCGTCGGAC AAGCTCAGCC CATCGGGACC CGACCGCTAT 660

AGCTATGGCG AGCAACGAGA CTTTTTGTTC GCGCTCTGGG ATGCGCTCGA CCTCGGCGAC 720

CACGTGGTAC TGGTGCTGCA CGACTGGGGC TCGGCGCTCG GCTTCGACTG GGCTAACCAG 780

CATCGCGACC GAGTGCAGGG GATCGCGTTC ATGGAAGCGA TCGTCACCCC GATGACGTGG 840

GCGGACTGGC CGCCGGCCGT GCGGGGTGTG TTCCAGGGTT TCCGATCGCC TCAAGGCGAG 900

CCAATGGCGT TGGAGCACAA CATCTTTGTC GAACGGGTGC TGCCCGGGGC GATCCTGCGA 960

CAGCTCAGCG ACGAGGAAAT GAACCACTAT CGGCGGCCAT TCGTGAACGG CGGCGAGGAC 1020

CGTCGCCCCA CGTTGTCGTG GCCACGAAAC CTTCCAATCG ACGGTGAGCC CGCCGAGGTC 1080

GTCGCGTTGG TCAACGAGTA CCGGAGCTGG CTCGAGGAAA CCGACATGCC GAAACTGTTC 1140

ATCAACGCCG AGCCCGGCGC GATCATCACC GGCCGCATCC GTGACTATGT CAGGAGCTGG 1200

CCCAACCAGA CCGAAATCAC AGTGCCCGGC GTGCATTTCG TTCAGGAGGA CAGCGATGGC 1260

GTCGTATCGT GGGCGGGCGC TCGGCAGCAT CGGCGACCTG GGAGCGCTCT CATTTCACGA 1320

GACCAAGAAT GTGATTTCCG GCGAAGGCGG CGCCCTGCTT GTCAACTCAT AAGACTTCCT 1380

GCTCCGGGCA GAGATTCTCA GGGAAAAGGG CACCAATCGC AGCCGCTTCC TTCGCAACGA 1440

GGTCGACAAA TATACGTGGC AGGACAAAGG TCTTCCTATT TGCCCAGCGA ATTAGTCGCT 1500

GCCTTTCTAT GGGCTCAGTT CGAGGAAGCC GAGCGGATCA CGCGTATCCG ATTGGACCTA 1560

TGGAACCGGT ATCATGAAAG CTTCGAATCA TTGGAACAGC GGGGGCTCCT GCGCCGTCCG 1620

ATCATCCCAC AGGGCTGCTC TCACAACGCC CACATGTACT ACGTGTTACT AGCGCCCAGC 1680

GCCGATCGGG AGGAGGTGCT GGCGCGTCTG ACGAGCGAAG GTATAGGCGC GGTCTTTCAT 1740

TACGTGCCGC TTCACGATTC GCCGGCCGGG CGTCGCT 1777

(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 324 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:

GAGATTGAAT CGTACCGGTC TCCTTAGCGG CTCCGTCCCG TGAATGCCCA TATCACGCAC 60

GGCCATGTTC TGGCTGTCGA CCTTCGCCCC ATGCCCGGAC GTTGGTAAAC CCAGGGTTTG 120

ATCAGTAATT CCGGGGGACG GTTGCGGGAA GGCGGCCAGG ATGTGCGTGA GCCGCGGCGC 180

CGCCGTCGCC CAGGCGACCG CTGGATGCTC AGCCCCGGTG CGGCGACGTA GCCAGCGTTT 240

GGCGCGTGTC GTCCACAGTG GTACTCCGGT GACGACGCGG CGCGGTGCCT GGGTGAAGAC 300

CGTGACCGAC GCCGCCGATT CAGA 324

(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1338 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

GCGGTACCGC CGCGTTGCGC TGGCACGGGA CCTGTACGAC CTGAACCACT TCGCCTCGCG 60

AACGATTGAC GAACCGCTCG TGCGGCGGCT GTGGGTGCTC AAGGTGTGGG GTGATGTCGT 120

CGATGACCGG CGCGGCACCC GGCCACTACG CGTCGAAGAC GTCCTCGCCG CCCGCAGCGA 180

GCACGACTTC CAGCCCGACT CGATCGGCGT GCTGACCCGT CCTGTCGCTA TGGCTGCCTG 240

GGAAGCTCGC GTTCGGAAGC GATTTGCGTT CCTCACTGAC CTCGACGCCG ACGAGCAGCG 300

GTGGGCCGCC TGCGACGAAC GGCACCGCCG CGAAGTGGAG AACGCGCTGG CGGTGCTGCG 360

GTCCTGATCA ACCTGCCGGC GATCGTGCCG TTCCGCTGGC ACGGTTGCGG CTGGACGCGG 420

CTGAATCGAC TAGATGAGAG CAGTTGGGCA CGAATCCGGC TGTGGTGGTG AGCAAGACAC 480

GAGTACTGTC ATCACTATTG GATGCACTGG ATGACCGGCC TGATTCAGCA GGACCAATGG 540

AACTGCCCGG GGCAAAACGT CTCGGAGATG ATCGGCGTCC CCTCGGAACC CTGCGGTGCT 600

GGCGTCATTC GGACATCGGT CCGGCTCGCG GGATCGTGGT GACGCCAGCG CTGAAGGAGT 660

GGAGCGCGGC GGTGCACGCG CTGCTGGACG GCCGGCAGAC GGTGCTGCTG CGTAAGGGCG 720

GGATCGGCGA GAAGCGCTTC GAGGTGGCGG CCCACGAGTT CTTGTTGTTC CCGACGGTCG 780

CGCACAGCCA CGCCGAGCGG GTTCGCCCCG AGCACCGCGA CCTGCTGGGC CCGGCGGCCG 840

CCGACAGCAC CGACGAGTGT GTGCTACTGC GGGCCGCAGC GAAAGTTGTT GCCGCACTGC 900

CGGTTAACCG GCCAGAGGGT CTGGACGCCA TCGAGGATCT GCACATCTGG ACCGCCGAGT 960

CGGTGCGCGC CGACCGGCTC GACTTTCGGC CCAAGCACAA ACTGGCCGTC TTGGTGGTCT 1020

CGGCGATCCC GCTGGCCGAG CCGGTCCGGC TGGCGCGTAG GCCCGAGTAC GGCGGTTGCA 1080

CCAGCTGGGT GCAGCTGCCG GTGACGCCGA CGTTGGCGGC GCCGGTGCAC GACGAGGCCG 1140

CGCTGGCCGA GGTCGCCGCC CGGGTCCGCG AGGCCG^GGG TTGACTGGGC GGCATCGCTT 1200

GGGTCTGAGC TGTACGCCCA GTCGGCGCTG CGAGTGATCT GCTGTCGGTT CGGTCCCTGC 1260

TGGCGTCAAT TGACGGCGCG GGCAACAGCA GCATTGGCGG CGCCATCCTC CGCGCGGCCG 1320

GCGCCCACCG CTACAACC 1338

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 321 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:159: 

CCGGCGGCAC CGGCGGCACC GGCGGTACCG GCGGCAACGG CGCTGACGCC GCTGCTGTGG 60

TGGGCTTCGG CGCGAACGGC GACCCTGGCT TCGCTGGCGG CAAAGGCGGT AACGGCGGAA 120

TAGGTGGGGC CGCGGTGACA GGCGGGGTCG CCGGCGACGG CGGCACCGGC GGCAAAGGTG 180

GCACCGGCGG TGCCGGCGGC GCCGGCAACG ACGCCGGCAG CACCGGCAAT CCCGGCGGTA 240

AGGGCGGCGA CGGCGGGATC GGCGGTGCCG GCGGGGCCGG CGGCGCGGCC GGCACCGGCA 300

ACGGCGGCCA TGCCGGCAAC C 321

(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 492 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

GAAGACCCGG CCCCGCCATA TCGATCGGCT CGCCGACTAC TTTCGCCGAA CGTGCACGCG 60

GCGGCGTCGG GCTGATCATC ACCGGTGGCT ACGCGCCCAA CCGCACCGGA TGGCTGCTGC 120

CGTTCGCCTC CGAACTCGTC ACTTCGGCGC AAGCCCGACG GCACCGCCGA ATCACCAGGG 180

CGGTCCACGA TTCGGGTGCA AAGATCCTGC TGCAAATCCT GCACGCCGGA CGCTACGCCT 240

ACCACCCACT TGCGGTCAGC GCCTCGCCGA TCAAGGCGCC GATCACCCCG TTTCGTCCGC 300

GAGCACTATC GGCTCGCGGG GTCGAAGCGA CCATCGCGGA TTTCGCCCGC TGCGCGCAGT 360

TGGCCCGCGA TGCCGGCTAC GACGGCGTCG AAATCATGGG CAGCGAAGGG TATCTGCTCA 420

ATCAGTTCCT GGCGCCGCGC ACCAACAAGC GCACCGACTC GTGGGGCGGC ACACCGGCCA 480

ACCGTCGCCG GT 492

(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 536 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

Phe Ala Gln His Leu Val Glu Gly Asp Ala Val Glu Leu Trp Arg Ala
1 5 10 15

Asn Ala Ala Asp Gln Ala Asp Pro Leu Gln Pro Gly Ser Ala Arg Arg
20 25 30

Gln Arg Ala Ser Arg Ser Pro Arg Arg Leu Ala Gly Pro Asn Ala Tyr
35 40 45

His Tyr Ser Asn Asn Arg Ser Ile Leu Cys Gln Arg Trp Pro Leu Pro
50 55 60

Ser Ala Ala Gln Asp Val Ile Cys His Leu Cys Pro His Arg Gln Glu
65 70 75 80

Pro Gly Leu Met Thr Ala Phe Gly Val Glu Pro Tyr Gly Gln Pro Lys
85 90 95

Tyr Leu Glu Ile Ala Gly Lys Arg Met Ala Tyr Ile Asp Glu Gly Lys
100 105 110

Gly Asp Ala Ile Val Phe Gln His Gly Asn Pro Thr Ser Ser Tyr Leu
115 120 125

Trp Arg Asn Ile Met Pro His Leu Glu Gly Leu Gly Arg Leu Val Ala
130 135 140

Cys Asp Leu Ile Gly Met Gly Ala Ser Asp Lys Leu Ser Pro Ser Gly
145 150 155 160

Pro Asp Arg Tyr Ser Tyr Gly Glu Gln Arg Asp Phe Leu Phe Ala Leu
165 170 175

Trp Asp Ala Leu Asp Leu Gly Asp His Val Val Leu Val Leu His Asp
180 185 190

Trp Gly Ser Ala Leu Gly Phe Asp Trp Ala Asn Gln His Arg Asp Arg
195 200 205

Val Gln Gly Ile Ala Phe Met Glu Ala Ile Val Thr Pro Met Thr Trp
210 215 220

Ala Asp Trp Pro Pro Ala Val Arg Gly Val Phe Gln Gly Phe Arg Ser
225 230 235 240

Pro Gln Gly Glu Pro Met Ala Leu Glu His Asn Ile Phe Val Glu Arg
245 250 255

Val Leu Pro Gly Ala Ile Leu Arg Gln Leu Ser Asp Glu Glu Met Asn
260 265 270

His Tyr Arg Arg Pro Phe Val Asn Gly Gly Glu Asp Arg Arg Pro Thr
275 280 285

Leu Ser Trp Pro Arg Asn Leu Pro Ile Asp Gly Glu Pro Ala Glu Val
290 295 300

Val Ala Leu Val Asn Glu Tyr Arg Ser Trp Leu Glu Glu Thr Asp Met
305 310 315 320

Pro Lys Leu Phe Ile Asn Ala Glu Pro Gly Ala Ile Ile Thr Gly Arg
325 330 335

Ile Arg Asp Tyr Val Arg Ser Trp Pro Asn Gln Thr Glu Ile Thr Val
340 345 350

Pro Gly Val His Phe Val Gln Glu Asp Ser Asp Gly Val Val Ser Trp
355 360 365

Ala Gly Ala Arg Gln His Arg Arg Pro Gly Ser Ala Leu Ile Ser Arg
370 375 380

Asp Gln Glu Cys Asp Phe Arg Arg Arg Arg Arg Pro Ala Cys Gln Leu
385 390 395 400

Ile Arg Leu Pro Ala Pro Gly Arg Asp Ser Gln Gly Lys Gly His Gln
405 410 415

Ser Gln Pro Leu Pro Ser Gln Arg Gly Arg Gln Ile Tyr Val Ala Gly
420 425 430

Gln Arg Ser Ser Tyr Leu Pro Ser Glu Leu Val Ala Ala Phe Leu Trp
435 440 445

Ala Gln Phe Glu Glu Ala Glu Arg Ile Thr Arg Ile Arg Leu Asp Leu
450 455 460

Trp Asn Arg Tyr His Glu Ser Phe Glu Ser Leu Glu Gln Arg Gly Leu
465 470 475 480

Leu Arg Arg Pro Ile Ile Pro Gln Gly Cys Ser His Asn Ala His Met
485 490 495

Tyr Tyr Val Leu Leu Ala Pro Ser Ala Asp Arg Glu Glu Val Leu Ala
500 505 510

Arg Leu Thr Ser Glu Gly Ile Gly Ala Val Phe His Tyr Val Pro Leu
515 520 525

His Asp Ser Pro Ala Gly Arg Arg
530 535

(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 284 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

Asn Glu Ser Ala Pro Arg Ser Pro Met Leu Pro Ser Ala Arg Pro Arg
1 5 10 15

T/r Asp Ala Ile Ala Val Leu Leu Asn Glu Met His Ala Gly His Cys
20 25 30

Asp Phe Gly Leu Val Gly Pro Ala Pro ^ Ile ^ ^ ^
35 40 45

Gly Asp Asp Arg Ala Gly Leu Gly Val Asp Glu Gln Phe Arg His Val
50 55 60

Gly Phe Leu Glu Pro Ala Pro Val Leu Val Asp Gln Arg Asp Asp Leu
65 70 75 80

Gly Gly Leu Thr Val Asp Trp Lys Val Ser Trp Pro Arg Gln Arg Gly
85 90 95

Ala Thr Val Leu Ala Ala Val His Glu Trp Pro Pro Ile Val Val His
100 105 110

Phe Leu Val Ala Glu Leu Ser Gln Asp Arg Pro Gly Gln His Pro Phe
115 120 125

Asp Lys Asp Val Val Leu Gln Arg His Trp Leu Ala Leu Arg Arg Ser
130 135 140

Glu Thr Leu Glu His Thr Pro His Gly Arg Arg Pro Val Arg Pro Arg
145 150 155 160

His Arg Gly Asp Asp Arg Phe His Glu Arg Asp Pro Leu His Ser Val
165 170 175

Ala Met Leu Val Ser Pro Val Glu Ala Glu Arg Arg Ala Pro Val Val
180 185 190

Gln His Gln Tyr His Val Val Ala Glu Val Glu Arg Ile Pro Glu Arg
195 200 205

Glu Gln Lys Val Ser Leu Leu Ala Ile Ala Ile Ala Val Gly Ser Arg
210 215 220

Trp Ala Glu Leu Val Arg Arg Ala His Pro Asp Gln Ile Ala Gly His
225 230 235 240

Gln Pro Ala Gln Pro Phe Gln Val Arg His Asp Val Ala Pro Gln Val
245 250 255

Arg Arg Arg Gly Val Ala Val Leu Lys Asp Asp Gly Val Thr Leu Ala
260 265 270

Phe Val Asp Ile Arg His Ala Leu Pro Gly Asp Phe
275 280

(2) INFORMATION FOR SEQ ID NO: 163:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 264 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

ATGAACATGT CGTCGGTGGT GGGTCGCAAG GCCTTTGCGC GATTCGCCGG CTACTCCTCC 60

GCCATGCACG CGATCGCCGG TTTCTCCGAT GCGTTGCGCC AAGAGCTGCG GGGTAGCGGA 120

ATCGCCGTCT CGGTGATCCA CCCGGCGCTG ACCCAGACAC CGCTGTTGGC CAACGTCGAC 180

CCCGCCGACA TGCCGCCGCC GTTTCGCAGC CTCACGCCCA TTCCCGTTCA CTGGGTCGCG 240

GCAGCGGTGC TTGACGGTGT GGCG 264

(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1171 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:



TAGTCGGCGA CGATGACGTC GCGGTCCAGG CCGACCGCTT CAAGCACCAG CGCGACCACG 60

AAGCCGGTGC GATCCTTACC CGCGAAGCAG TGGGTGAGCA CCGGGCGTCC GGCGGCAAGC 120

AGTGTGACGA CACGATGTAG CGCGCGCTGT GCTCCATTGC GCGTTGGGAA TTGGCGATAC 180

TCGTCGGTCA TGTAGCGGGT GGCCGCGTCA TTTATCGACT GGCTGGATTC GCCGGACTCG 240

CCGTTGGACC CGTCATTGGT TAGCAGCCTC TTGAATGCGG TTTCGTGCGG CGCTGAGTCG 300

TCGGCGTCAT CATCGGCGAG GTCGGGGAAC GGCAGCAGGT GGACGTCGAT GCCGTCCGGA 360

ACCCGTCCTG GACCGCGGCG GGCAACCTCC CGGGACGACC GCAGGTCGGC AACGTCGGTG 420

ATCCCCAGCC GGCGCAGCGT TGCCCCTCGT GCCGAATTCG GCACGAGGCT GGCGAGCCAC 480

CGGGCATCAC CAAGCAACGC TTGCCCAGTA CGGATCGTCA CTTCCGCATC CGGCAGACCA 540

ATCTCCTCGC CGCCCATCGT CAGATCCCGC TCGTGCGTTG ACAAGAACGG CCGCAGATGT 600

GCCAGCGGGT ATCGGAGATT GAACCGCGCA CGCAGTTCTT CAATCGCTGC GCGCTGCCGC 660

ACTATTGGCA CTTTCCSGCG GTCGCGGTAT TCAGCAAGCA TGCGAGTCTC GACGAACTCG 720

CCCCACGTAA CCCACGGCGT AGCTCCCGGC GTGACGCGGA GGATCGGCGG GTGATCTTTG 780

CCGCCACGCT CGTAGCCGTT GATCCACCGC TTCGCGGTGC CGGCGGGGAG GCCGATCAGC 840

TTATCGACCT CGGCGTATGC CGACGGCAAG CTGGGCGCGT TCGTCGAGGT CAAGAACTCC 900

ACCATCGGCA CCGGCACCAA GGTGCCGCAC CTGACCTACG TCGGCGACGC CGACATCGGC 960

GAGTACAGCA ACATCGGCGC CTCCAGCGTG TTCGTCAACT ACGACGGTAC GTCCAAACGG 1020

CGCACCACCG TCGGTTCGCA CGTACGGACC GGGTCCGACA CCATGTTCGT GGCCCCAGTA 1080

ACCATCGGCG ACGGCGCGTA TACCGGGGCC GGCACAGTGG TGCGGGAGGA TGTCCCGCCG 1140

GGGGCGCTGG CAGTGTCGGC GGGTCCGCAA C 1171


(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 227 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60
ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGCGCCGGC GGCACCAGCT 120
TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180
GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCC 227

(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

CCTCGCCACC ATGGGCGGGC AGGGCGGTAG CGGTGGCGCC GGCTCTACCC CAGGCGCCAA 60
GGGCGCCCAC GGCTTCACTC CAACCAGCGG CGCCGACGGC GGCGACGGCG GCAACGGCGG 120
CAACTCCCAA GTGGTCGGCG GCAACGGCGG CGACGGCGGC AATGGCGGCA ACGGCGGCAG 180
CGCCGGCACG GGCGGCAACG GCGGCCGCGG CGCCGACGGC GCGTTTGGTG GCATGAGTGC 240
CAACGCCACC AACCCTGGTG AAAACGGGCC AAACGGTAAC CCCGGCGGCA ACGGTGGCGC 300
CGGC 304

(2) INFORMATION FOR SEQ ID NO: 167:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1439 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

GTGGGACGCT GCCGAGGCTG TATAACAAGG ACAACATCGA CCAGCGCCGG CTCGGTGAGC 60

TGATCGACCT ATTTAACAGT GCGCGCTTCA GCCGGCAGGG CGAGCACCGC GCCCGGGATC 120

TGATGGGTGA GGTCTACGAA TACTTCCTCG GCAATTTCGC TCGCGCGGAA GGGAAGCGGG 180

GTGGCGAGTT CTTTACCCCG CCCAGCGTGG TCAAGGTGAT CGTGGAGGTG CTGGAGCCGT 240

CGAGTGGGCG GGTGTATGAC CCGTGCTGCG GTTCCGGAGG CATGTTTGTG CAGACCGAGA 300

AGTTCATCTA CGAACACGAC GGCGATCCGA AGGATGTCTC GATCTATGGC CAGGAAAGCA 360

TTGAGGAGAC CTGGCGGATG GCGAAGATGA ACCTCGCCAT CCACGGCATC GACAACAAGG 420

GGCTCGGCGC CCGATGGAGT GATACCTTCG CCCGCGACCA GCACCCGGAC GTGCAGATGG 480

ACTACGTGAT GGCCAATCCG CCGTTCAACA TCAAAGACTG GGCCCGCAAC GAGGAAGACC 540

CACGCTGGCG CTTCGGTGTT CCGCCCGCCA ATAACGCCAA CTACGCATGG ATTCAGCACA 600

TCCTGTACAA CTTGGCGCCG GGAGGTCGGG CGGGCGTGGT GATGGCCAAC GGGTCGATGT 660

CGTCGAACTC CAACGGCAAG GGGGATATTC GCGCGCAAAT CGTGGAGGCG GATTTGGTTT 720

CCTGCATGGT CGCGTTACCC ACCCAGCTGT TCCGCAGCAC CGGAATCCCG GTGTGCCTGT 780

GGTTTTTCGC CAAAAACAAG GCGGCAGGTA AGCAAGGGTC TATCAACCGG TGCGGGCAGG 840

TGCTGTTCAT CGACGCTCGT GAACTGGGCG ACCTAGTGGA CCGGGCCGAG CGGGCGCTGA 900

CCAACGAGGA GATCGTCCGC ATCGGGGATA CCTTCCACGC GAGCACGACC ACCGGCAACG 960

CCGGCTCCGG TGGTGCCGGC GGTAATGGGG GCACTGGCCT CAACGGCGCG GGCGGTGCTG 1020

GCGGGGCCGG CGGCAACGCG GGTGTCGCCG GCGTGTCCTT CGGCAACGCT GTGGGCGGCG 1080

ACGGCGGCAA CGGCGGCAAC GGCGGCCACG GCGGCGACGG CACGACGGGC GGCGCCGGCG 1140

GCAAGGGCGG CAACGGCAGC AGCGGTGCCG CCAGCGGCTC AGGCGTCGTC AACGTCACCG 1200

CCGGCCACGG CGGCAACGGC GGCAATGGCG GCAACGGCGG CAACGGCTCC GCGGGCGCCG 1260

GCGGCCAGGG CGGTGCCGGC GGCAGCGCCG GCAACGGCGG CCACGGCGGC GGTGCCACCG 1320

3CGGCGCCAG CGGCAAGGGC GGCAACGGCA CCAGCGGTGC CGCCAGCGGC TCAGGCGTCA 1380

TCAACGTCAC CGCCGGCCAC GGCGGCAACG GCGGCAATGG CCGCAACGGC GGCAACGGC 1439


(2) INFORMATION FOR SEQ ID NO: 168: 









(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 329 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

GGGCCGGCGG GGCCGGATTT TCTCGTGCCT TGATTGTCGC TGGGGATAAC GGCGGTGATG 60
GTGGTAACGG CGGGATGGGC GGGGCTGGCG GGGCTGGCGG CCCCGGCGGG GCCGGCGGCC 120
TGATCAGCCT GCTGGGCGGC CAAGGCGCCG GCGGGGCCGG CGGGACCGGC GGGGCCGGCG 180
GTGTTGGCGG TGACGGCGGG GCCGGCGGCC CCGGCAACCA GGCCTTCAAC GCAGGTGCCG 240
GCGGGGCCGG CGGCCTGATC AGCCTGCTGG GCGGCCAAGG CGCCGGCGGG GCCGGCGGGA 300
CCGGCGGGGC CGGCGGTGTT GGCGGTGAC 329
(2) INFORMATION FOR SEQ ID NO: 169:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 
GCAACGGTGG CAACGGCGGC ACCAGCACGA CCGTGGGGAT GGCCGGAGGT AACTGTGGTG 
CCGCCGGGCT GATCGGCAAC 
(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 392 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:
GGGCTGTGTC GCACTCACAC CGCCGCATTC GGCGACGTTG GCCGCCCAAT ATCCAGCTCA
AGGCCTACTA CTTACCGTCG GAGGACCGCC GCATCAAGGT GCGGGTCAGC GCCCAAGGAA 
TCAAGGTCAT CGACCGCGAC GGGCATCGAG GCCGTCGTCG CGCGGCTCGG GCAGGATCCG 
CCCCGGCGCA CTTCGCGCGC CAAGCGGGCT CATCGCTCCG AACGGCGGCG ATCCTGTGAG 
CACAAC7GAT GGCGCGCAAC GAGATTCGTC CAATTGTCAA GCCGTGTTCG ACCGCAGGGA 
CCGGTTATAC GTATGTCAAC CTATGTCACT CGCAAGAACC GGCATAACGA TCCCGTGATC 
CGCCGACAGC CCACGAGTGC AAGACCGTTA CA 
(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 535 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:
ACCGGCGCCA CCGGCGGCAC CGGGTTCGCC GGTGGCGCCG GCGGGGCCGG CGGGCAGGGC 60
GGTATCAGCG GTGCCGGCGG CACCAACGGC TCTGGTGGCG CTGGCGGCAC CGGCGGACAA 120
GGCGGCGCCG GGGGCGCTGG CGGGGCCGGC GCCGATAACC CCACCGGCAT CGGCGGCGCC 180
GGCGGCACCG GCGGCACCGG CGGAGCGGCC GGAGCCGGCG GGGCCGGTGG CGCCATCGGT 240
ACCGGCGGCA CCGGCGGCGC GGTGGGCAGC GTCGGTAACG CCGGGATCGG CGGTACCGGC 300
GGTACGGGTG GTGTCGGTGG TGCTGGTGGT GCAGGTGCGG CTGCGGCCGC TGGCAGCAGC 360
GCTACCGGTG GCGCCGGGTT CGCCGGCGGC GCCGGCGGAG AAGGCGGACC GGGCGGCAAC 420
AGCGGTGTGG GCGGCACCAA CGGCTCCGGC GGCGCCGGCG GTGCAGGCGG CAAGGGCGGC 480
ACCGGAGGTG CCGGCGGGTC CGGCGCGGAC AACCCCACCG GTGCTGGTTT CGCCG 535
(2) INFORMATION FOR SEQ ID NO: 172:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 690 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:

CCGACGTCGC CGGGGCGATA CGGGGGTCAC CGACTACTAC ATCATCCGCA CCGAGAATCG 60
GCCGCTGCTG CAACCGCTGC GGGCGGTGCC GGTCATCGGA GATCCGCTGG CCGACCTGAT 120
CCAGCCGAAC CTGAAGGTGA TCGTCAACCT GGGCTACGGC GACCCGAACT ACGGCTACTC 180
GACGAGCTAC GCCGATGTGC GAACGCCGTT CGGGCTGTGG CCGAACGTGC CGCCTCAGGT 240
CATCGCCGAT GCCCTGGCCG CCGGAACACA AGAAGGCATC CTTGACTTCA CGGCCGACCT 300
GCAGGCGCTG TCCGCGCAAC CGCTCACGCT CCCGCAGATC CAGCTGCCGC AACCCGCCGA 360
TCTGGTGGCC GCGGTGGCCG CCGCACCGAC GCCGGCCGAG GTGGTGAACA CGCTCGCCAG 420
GATCATCTCA ACCAACTACG CCGTCCTGCT GCCCACCGTG GACATCGCCC TCGCCTGGTC 480
ACCACCCTGC CGCTGTACAC CACCCAACTG TTCGTCAGGC AACTCGCTGC GGGCAATCTG 540
ATCAACGCGA TCGGCTATCC CCTGGCGGCC ACCGTAGGTT TAGGCACGAT CGATAGCGGG 600
CGGCGTGGAA TTGCTCACCC TCCTCGCGGC GGCCTCGGAC ACCGTTCGAA ACATCGAGGG 660
CCTCGTCACC TAACGGATTC CCGACGGCAT 690
(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 407 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

ACGGTGACGG CGGTACTGGC GGCGGCCACG GCGGCAACGG CGGGAATCCC GGGTGGCTCT 60
TGGGCACAGC CGGGGGTGGC GGCAACGGTG GCGCCGGCAG CACCGGTACT GCAGGTGGCG 120
GCTCTGGGGG CACCGGCGGC GACGGCGGGA CCGGCGGGCG TGGCGGCCTG TTAATGGGCG 180
CCGGCGCCGG CGGGCACGGT GGCACTGGCG GCGCGGGCGG TGCCGGTGTC GACGGTGGCG 240
GCGCCGGCGG GGCCGGCGGG GCCGGCGGCA ACGGCGGCGC CGGGGGTCAA GCCGCCCTGC 300
TGTTCGGGCG CGGCGGCACC GGCGGAGCCG GCGGCTACGG CGGCGATGGC GGTGGCGGCG 360
GTGACGGCTT CGACGGCACG ATGGCCGGCC TGGGTGGTAC CGGTGGC 407



(2) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 

GATCGGTCAG CGCATCGCCC TCGGCGGCAA GCGATTCCGC GGTCTCACCG AAGAACATCG 60 

TGCACGCGGC GGCGCGGACC AGCCCGCTGC GCTGCGGCGC GTCGAACGCC TCCAGCAGGC 120 

ACAGCCAGTC CTTGGCGGCC TGCGAGGCGA ACACGTCGGT GTCACCGGTG TAGATCGCCG 180 

GGATGCCCGC CTCCGCCAAC GCATTCCGGC ACGCCCGCGC GTCTTTGTGA TGCTCGACGA 240

TCACCGCGAT GTCTGCGGCC ACCACGGGCC GCCCGGCGAA GGTGGCCCCG CTGGCCAGTA 300

GCGCCGCGAC GTCGGCGGCC AGGTCGTCGG GGATGTGCCG GCGCAGCGCT CCGGCGCGAC 360

GCCCGAAAAA CGACCCCTCA CCCAGCTGGG TCCCGCTGGC ATATCCCTTG CCGTCCTGGG 420

CGATATTGGA CGCGCATGCC CCGACCGCGT ACAGGCCGGC CACCACCG 468 
(2) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 
GGTGGTAACG GCGGCCAGGG TGGCATCGGC GGCGCCGGCG AGAGAGGCGC CGACGGCGCC 60 
GGCCCCAATG CTAACGGCGC AAACGGCGAG AACGGCGGTA GCGGTGGTAA CGGTGGCGAC 120
GGCGGCGCCG GCGGCAATGG CGGCGCGGGC GGCAACGCGC AGGCGGCCGG GTACACCGAC 180
GGCGCCACGG GCACCGGCGG CGACGGCGGC AACGGCGGC 219
(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 494 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176: 

TAGCTCCGGC GAGGGCGGCA AGGGCGGCGA CGGTGGCCAC GGCGGTGACG GCGTCGGCGG 60 

CAACAGTTCC GTCACCCAAG GCGGCAGCGG CGGTGGCGGC GGCGCCGGCG GCGCCGGCGG 120 

CAGCGGCTTT TTCGGCGGCA AGGGCGGCTT CGGCGGCGAC GGCGGTCAGG GCGGCCCCAA 180 

CGGCGGCGGT ACCGTCGGCA CCGTGGCCGG TGGCGGCGGC AACGGCGGTG TCGGCGGCCG 240

GGGCGGCGAC GGCGTCTTTG CCGGTGCCGG CGGCCAGGGC GGCCTCGGTG GGCAGGGCGG 300 

CAATGGCGGC GGCTCCACCG GCGGCAACGG CGGCCTTGGC GGCGCGGGCG GTGGCGGAGG 360 

CAACGCCCCG GCTCGTGCCG AATCCGGGCT GACCATGGAC AGCGCGGCCA AGTTCGCTGC 420 

CATCGCATCA GGCGCGTACT GCCCCGAACA CCTGGAACAT CACCCGAGTT AGCGGGGCGC 480 

ATTTCCTGAT CACC 494 
(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 220 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 60 

TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 120 

CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 180 

GCCAGAGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC 220 
(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 388 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

ATGGCGGCAA CGGGGGCCCC GGCGGTGCTG GCGGGGCCGG CGACTACAAT TTCCAACGGC 60
GGGCAGGGTG GTGCCGGCGG CCAAGGCGGC CAAGGCGGCC TGGGCGGGGC AAGCACCACC 120
TGATCGGCCT AGCCGCACCC GGGAAAGCCG ATCCAACAGG CGACGATGCC GCCTTCCTTG 180
CCGCGTTGGA CCAGGCCGGC ATCACCTACG CTGACCCAGG CCACGCCATA ACGGCCGCCA 240
AGGCGATGTG TGGGCTGTGT GCTAACGGCG TAACAGGTCT ACAGCTGGTC GCGGACCTGC 300
GGGACTACAA TCCCGGGCTG ACCATGGACA GCGCGGCCAA GTTCGCTGCC ATCGCATCAG 360
GCGCGTACTG CCCCGAACAC CTGGAACA 388


(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 400 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60
ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGCGCCGGC GGCACCAGCT 120
TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180
GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCCGGC ACCACAGGCG 240
GCGACGGCGG GGCCGGCGGG GCCGGCGGAA CCGGCGGAAC CGGCGGAGCC GCCGGCACCG 300
GCACCGGCGG CCAACAAGGC AACGGCGGCA ACGGCGGCAC CGGCGGCAAA GGCGGCACCG 360
GCGGCGACGG TGCACTCTCA GGCAGCACCG GTGGTGCCGG 400


(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 538 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

kCGGCG GCAACGGCGG CATCGCCGGC ATTGGGCGGC AACGGCGTTC CGGGACGl
AGCGGCAACG GCGGCCAACG GCGGCAGCGG CGGCAACGGC GGCAACGCCG GCATGGGCGG
CAACAGCGGC ACCGGCAGCG GCGACGGCGG TGCCGGCGGG AACGGCGGCG CGGCGGGCAC 
GGGCGGCACC GGCGGCGACG GCGGCCTCAC CGGTACTGGC GGCACCGGCG GCAGCGGTGG 
CACCGGCGGT GACGGCGGTA ACGGCGGCAA CGGAGCAGAT AACACCGCAA ACATGACTGC 
GCAGGCGGGC GGTGACGGTG GCAACGGCGG CGACGGTGGC TTCGGCGGCG GGGCCGGGGC 
CGGCGGCGGT GGCTTGACCG CTGGCGCCAA CGGCACCGGC GGGCAAGGCG GCGCCGGCGG 
CGATGGCGGC AACGGGGCCA TCGGCGGCCA CGGCCCACTC ACTGACGACC CCGGCGGCAA 
CGGGGGCACC GGCGGCAACG GCGGCAGCGG CGGCACCGGC GGCGCGGGCA TCGGCAGC
(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 239 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:
GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 
TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 
CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 
GCGACGGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC CGGTGGTGCC GGCGGCACC
(2) INFORMATION FOR SEQ ID NO: 182:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 985 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:
AGCAGCGCTA CCGGTGGCGC CGGGTTCGCC GGCGGCGCCG GCGGAGAAGG CGGAGCGGGC 
GGCAACAGCG GTGTGGGCGG CACCAACGGC TCCGGCGGCG CCGGCGGTGC AGGCGGCAAG 
GGCGGCACCG GAGGTGCCGG CGGGTCCGGC ^CGGACAACC CCACCGGTGC TGGTTTCGCC 
GGTGGCGCCG GCGGCACAGG TGGCGCGGCC GGCGCCGGCG GGGCCGGCGG GGCGACCGGT 
ACCGGCGGCA CCGGCGGCGT TGTCGGCGCC ACCGGTAGTG CAGGCATCGG CGGGGCCGGC 
GTGACGGCGG CGATGGGGCC AGCGGTCTCG GCCTGGGCCT CTCCGGCTTT
GACGGCGGCC AAGGCGGCCA AGGCGGGGCC GGCGGCAGCG CCGGCGCCGG CGGCATCAAC 420
GGGGCCGGCG GGGCCGGCGG CAACGGCGGC GACGGCGGGG ACGGCGCAAC CGGTGCCGCA 480
GGTCTCGGCG ACAACGGCGG GGTCGGCGGT CCGGTGGCGC CGCCGGCAAC 540
GGCGGCAACG CGGGCGTCGG CCTGACAGCC AAGGCCGGCG ACGGCGGCGC CGCGGGCAAT 600
GGCGGCAACG GGGGCGCCGG CGGTGCTGGC GGGGCCGGCG ACAACAATTT CAACGGCGGC 660


CAGGGTGGTG ccmzrrzr , nr>n * r*^^~~~ 

w*>jwi\juiu ^uioijLLKiCCA AGGCGGCCAA 


GGCGGCTTGG 


GCGGGGCAAG 


CACCACCTGA 


720 


TCGGCCTAGC CGCACCCGGG AAAGCCGATC CAACAGGCGA CGATGCCGCC TTCCTTGCCG 780
CGTTGGACCA GGCCGGCATC ACCTACGCTG ACCCAGGCCA CGCCATAACG GCCGCCAAGG 840
C3ATGTGTGG GCTGTGTGCT AACGGCGTAA CAGGTCTACA GCTGGTCGCG GACCTGCGGG 900
AATACAATCC CGGGCTGACC ATGGACAGCG CGGCCAAGTT CGCTGCCATC GCATCAGGCG 960
CGTACTGCCC CGAACACCTG GAACA 985


(2) INFORMATION FOR SEQ ID NO: 183:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2138 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60
CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120
ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180
AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240
AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300
CCATCACACC GTGCGAACTC ACGGCGGCTA AAAACGCCGC CCAACAGCTG GTATTGTCCG 360
CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420
CGCTGCGCAA CGCGGCCAAG GCGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480
ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540
CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600
TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTTTG 660
CGGATGGGTG GAACACTTTC AACCTGACGC TGCAAGGCGA CGTCAAGCGG TTCCGGGGGT

TTGACAACTG GGAAGGCGAT GCGGCTACCG CTTGCGAGGC TTCGCTCGAT CAACAACGGC 

AATGGATACT CCACATGGCC AAATTGAGCG CTGCGATGGC CAAGCAGGCT CAATATGTCG 

CGCAGCTGCA CGTGTGGGCT AGGCGGGAAC ATCCGACTTA TGAAGACATA GTCGGGCTCG 

AACGGCTTTA CGCGGAAAAC CCTTCGGCCC GCGACCAAAT TCTCCCGGTG TACGCGGAGT 

ATCAGCAGAG GTCGGAGAAG GTGCTGACCG AATACAACAA CAAGGCAGCC CTGGAACCGG 

TAAACCCGCC GAAGCCTCCC CCCGCCATCA AGATCGACCC GCCCCCGCCT CCGCAAGAGC 

AGGGATTGAT CCCTGGCTTC CTGATGCCGC CGTCTGACGG CTCCGGTGTG ACTCCCGGTA 

CCGGGATGCC AGCCGCACCG ATGGTTCCGC CTACCGGATC GCCGGGTGGT GGCCTCCCGG 

CTGACACGGC GGCGCAGCTG ACGTCGGCTG GGCGGGAAGC CGCAGCGCTG TCGGGCGACG 

TGGCGGTCAA AGCGGCATCG CTCGGTGGCG GTGGAGGCGG CGGGGTGCCG TCGGCGCCGT 

TGGGATCCGC GATCGGGGGC GCCGAATCGG TGCGGCCCGC TGGCGCTGGT GACATTGCCG 

GCTTAGGCCA GGGAAGGGCC GGCGGCGGCG CCGCGCTGGG CGGCGGTGGC ATGGGAATGC

CGATGGGTGC CGCGCATCAG GGACAAGGGG GCGCCAAGTC CAAGGGTTCT CAGCAGGAAG 

ACGAGGCGCT CTACACCGAG GATCGGGCAT GGACCGAGGC CGTCATTGGT AACCGTCGGC 

GCCAGGACAG TAAGGAGTCG AAGTGAGCAT GGACGAATTG GACCCGCATG TCGCCCGGGC 

GTTGACGCTG GCGGCGCGGT TTCAGTCGGC CCTAGACGGG ACGCTCAATC AGATGAACAA 

CGGATCCTTC CGCGCCACCG ACGAAGCCGA GACCGTCGAA GTGACGATCA ATGGGCACCA 

GTGGCTCACC GGCCTGCGCA TCGAAGATGG TTTGCTGAAG AAGCTGGGTG CCGAGGCGGT 

GGCTCAGCGG GTCAACGAGG CGCTGCACAA TGCGCAGGCC GCGGCGTCCG CGTATAACGA 

CGCGGCGGGC GAGCAGCTGA CCGCTGCGTT ATCGGCCATG TCCCGCGCGA TGAACGAAGG 

AATGGCCTAA GCCCATTGTT GCGGTGGTAG CGACTACGCA CCGAATGAGC GCCGCAATGC 

GGTCATTCAG CGCGCCCGAC ACGGCGTGAG TACGCATTGT CAATGTTTTG ACATGGATCG 

GCCGGGTTCG GAGGGCGCCA TAGTCCTGGT CGCCAATATT GCCGCAGCTA GCTGGTCTTA 

GGTTCGGTTA CGCTGGTTAA TTATGACGTC CGTTACCA 

(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 460 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn
5 10 15

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val
20 25 30

Pro Ile Thr Pro Cys Glu Leu Thr Ala Ala Lys Asn Ala Ala Gln Gln
35 40 45

Leu Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala
50 55 60

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Ala
65 70 75 80

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly
85 90 95

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser
100 105 110

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro
115 120 125

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp
130 135 140

Gln Gly Ala Ser Leu Ala His Phe Ala Asp Gly Trp Asn Thr Phe Asn
145 150 155 160

Leu Thr Leu Gln Gly Asp Val Lys Arg Phe Arg Gly Phe Asp Asn Trp
165 170 175

Glu Gly Asp Ala Ala Thr Ala Cys Glu Ala Ser Leu Asp Gln Gln Arg
180 185 190

Gln Trp Ile Leu His Met Ala Lys Leu Ser Ala Ala Met Ala Lys Gln
195 200 205

Ala Gln Tyr Val Ala Gln Leu His Val Trp Ala Arg Arg Glu His Pro
210 215 220

Thr Tyr Glu Asp Ile Val Gly Leu Glu Arg Leu Tyr Ala Glu Asn Pro
225 230 235 240

Ser Ala Arg Asp Gln Ile Leu Pro Val Tyr Ala Glu Tyr Gln Gln Arg
245 250 255

Ser Glu Lys Val Leu Thr Glu Tyr Asn Asn Lys Ala Ala Leu Glu Pro
260 265 270

Val Asn Pro Pro Lys Pro Pro Pro Ala Ile Lys Ile Asp Pro Pro Pro
275 280 285

Pro Pro Gln Glu Gln Gly Leu Ile Pro Gly Phe Leu Met Pro Pro Ser
290 295 300

Asp Gly Ser Gly Val Thr Pro Gly Thr Gly Met Pro Ala Ala Pro Met
305 310 315 320

Val Pro Pro Thr Gly Ser Pro Gly Gly Gly Leu Pro Ala Asp Thr Ala
325 330 335

Ala Gln Leu Thr Ser Ala Gly Arg Glu Ala Ala Ala Leu Ser Gly Asp
340 345 350

Val Ala Val Lys Ala Ala Ser Leu Gly Gly Gly Gly Gly Gly Gly Val
355 360 365

Pro Ser Ala Pro Leu Gly Ser Ala Ile Gly Gly Ala Glu Ser Val Arg
370 375 380

Pro Ala Gly Ala Gly Asp Ile Ala Gly Leu Gly Gln Gly Arg Ala Gly
385 390 395 400

Gly Gly Ala Ala Leu Gly Gly Gly Gly Met Gly Met Pro Met Gly Ala
405 410 415

Ala His Gln Gly Gln Gly Gly Ala Lys Ser Lys Gly Ser Gln Gln Glu
420 425 430

Asp Glu Ala Leu Tyr Thr Glu Asp Arg Ala Trp Thr Glu Ala Val Ile
435 440 445

Gly Asn Arg Arg Arg Gln Asp Ser Lys Glu Ser Lys
450 455 460



(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 277 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

Ala Gly Asn Val Thr Ser Ala Ser Gly Pro His Arg Phe Gly Ala Pro 

10 15 

Asp Arg Gly Ser Gln Arg Arg Arg .Arg Hi. Pro Ala Ala Ser Thr Ala 

25 30 
Thr Glu Arg Cys Arg Phe Asp Arg His Val Ala Arg Gln Arg Cys Gly 



40 45
Phe Pro Pro Ser Arg Arg Gln Leu Arg Arg Arg Val Ser Arg Glu Ala

Thr Thr Arg Arg Ser Gly Arg Arg Asn His Arg Cys Gly Trp His Pro 

70 75 80 

Gly Thr Gly Ser His Thr Gly Ala Val Arg Arg Arg His Gin Glu Ala 

85 9° 95 

Arg Asp Gin Ser Leu Leu Leu Arg Arg Arg Gly Arg Val Asp Leu Asp 

105 110 

Gly Gly Gly Arg Leu Arg Arg Val Tyr Arg Phe Gin Gly Cys Leu Val 
15 12 ° 125 

val val Phe Gly Gin His Leu Leu Arg Pro Leu Leu lie Leu Arg Val 

135 140 

His Arg Glu Asn Leu Val Ala Gly Arg Arg Val Phe Arg Val Lys Pro 

150 "5 160 

Phe Glu Pro Asp Tyr Val Phe He Ser Arg Met Phe Pro Pro Ser Pro 

155 W 175 

His Val Gin Leu Arg Asp H e Leu Ser Leu Leu Gly H iS Arg Ser Ala 

185 190 

Gin P he Gly His Val Glu Tyr Pro Leu Pro Leu Leu lie Glu Arg Ser 

200 205 

Leu Ala ser Gly Ser Arg He Ala Phe Pro Val Val Lys Pro Pro Glu 

215 220 

Pro Leu Asp Val Ala Leu Gin Arg Gin Val Glu Ser Val Pro Pro lie 

230 -i -j c 

* 35 240 

Arg Lys Val Arg Glu Arg Cys Ala Leu Val Ala Arg Phe Glu Leu Pro 

24S 250 255 

Cys Arg Phe Phe Glu Ile His Glu Val Gly Phe Thr Gly Arg Gly His
260 265 270

Pro Arg Arg Ile Gly
275

(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

Arg Val Ala Ala Ser Phe Ile Asp Trp Leu Asp Ser Pro Asp Ser Pro

10 1S 

L eu Asp Pro ser Leu Val Ser Ser Leu Leu Asn Ala Val Ser Cys Gly 

25 30 

Ala Glu Ser Ser Ala c^>- n 

^ ber Ala Ser Ser Ser Ala Arg Ser Gly Asn Gly Ser Arg 
40 45 

Trp Thr Ser Met Pro Ser rciv a 

5q Pro ser Gly Thr Arg Pro Gly Pro Arg Arg Ala Thr 

55 60 
Ser Arg Asp Asp Arg Arg Ser Ala Thr Ser Val lie Pro Ser Arg Arg 

75 80 

ser val Ala Pro Arg Ala Glu Phe Gly Thr Arg Leu Ala Ser Hxs Arg 

90 9S 

Ala ser Pro Ser Asn Ala Cys Pro Val Arg He Val Thr Ser Ala Ser 

105 no 

Gly Arg Pro He Ser Ser Pro Pro He Val Arg Ser Arg Ser Cys Val 

120 125 

ASP Lys Asn C-ly Arg Arg C ys Ala Ser Gly Tyr Arg Arg Leu Asn Arg 

135 140 
Ala Arg Ser Ser Ser lie Ala Ala Arg Cys Arg Thr He Gly Thr Phe 
150 "5 160 

Arg Arg Ser Arg Tyr Ser Ala Ser Me t Arg Val Ser Thr Asn Ser Pro 

165 17 ° 175 

Hi- Val Thr Hi. Gly Val Ala Pro Gly Val Thr Arg Arg He Gly Gly 



"5 190 



(2) INFORMATION FOR SEQ ID NO: 187:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 196 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

Gin Glu Arg Pro Gin Met Cys" Gin Arg Val Ser Glu He Glu Pro Arg 

io ls 

Thr Gin Phe Phe Asn Arg Cvs Ala Leu d^-, u - ^ „ 

y uys Axa Leu Pro His Tyr Trp His Phe Pro 

25 30 

Ala Val Ala Val Phe Ser Lys His Ala s^r r » 

' nis A - La Se ^ Leu Asp Glu Leu Ale 



la Pro



40 45 



Arg Asn Pro Arg Arg Ser Ser Arg ^ ^ ^ ^ ^ ^ ^ ^ 

55 60 

lie Phe Ala Ala Thr Leu Val Ala Val Asp Pro Pro Leu Arg Gly Ala 

70 7 S 80 

Gly Gly Glu Ala Asp Gin Leu Ile ^ Leu Qly ^ Cys 

85 9° 95 

Ala Gly Arg Val Arg Arg Gly Gin Glu Leu His His Arg His Arg His 

105 110 

Gin Gly Ala Ala Pro Asp Leu Arg Arg Arg Arg Arg His Arg Arg Val 

120 125 

Gin Gin His Arg ^ Leu Qln ^ ^ ^ ^ ^ ^ ^ ^ ^ 



135 140 



Gin Thr Ala His His Arg Arg Phe Ala Arg Thr 



14 5 Vcn S 3 ACg Thr ^9 Val ^3 His 



150 



155 



ISO 

His Val Arg Gly Pro Ser Asn Hi, Arg Arg Arg Arg Val Tyr Axg Gly 

165 170 175 

Arg His Ser Gly Ma Gly Gly C ys Pro Ala Gly Gly Ala Gly Ser Val 



185 



190 



180 

Gly Gly Ser Ala 
195 

(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 311 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

Val Arg Cys Gly Thr Leu Val Pro Val Pro Met Val Glu Phe Leu Thr 
5 10 15 

Ser Thr Asn Ala Pro Ser Leu Pro Ser Ala Tyr Ala Glu Val Asp Lys

20 25 30

Leu Ile Gly Leu Pro Ala Gly Thr Ala Lys Arg Trp Ile Asn Gly Tyr

35 40 45

Glu Arg Gly Gly Lys Asp His Pro Pro Ile Leu Arg Val Thr Pro Gly

50 55 60

Ala Thr Pro Trp Val Thr Trp Gly Glu Phe Val Glu Thr Arg Met Leu
65 70 75 80

Ala Glu Tyr Arg Asp Arg Arg Lys Val Pro lie Val Arg Gin Arg Ala 

85 90 95 

Ala lie Glu gj u ^ Arg Ma ^ phe ^ ^ ^ ^ prQ ^ ^ 



105 



110 



His Leu Arg Pro Phe Leu Ser Thr His Glu Arg Asp Leu Thr Met Gly 

120 125 

Gly Glu Glu He Gly Leu Pro Asp Ala Glu Val Thr lie Arg Thr Gly 

135 140 

Gin Ala Leu Leu Gly Asp Ala Arg Trp Leu Ala Ser Leu Val Pro Asn 
150 155 160 

Ser Ala Arg Gly Ala Thr Leu Arg Arg Leu Gly lie Thr Asp Val Ala 
165 170 175 

Asp Leu Arg Ser Ser Arg Glu Val Ala Arg Arg Gly Pro Gly Arg Val 

185 19Q 

Pro Asp Gly lie Asp Val His Leu Leu Pro Phe Pro Asp Leu Ala Asp 

200 205 

Asp Asp Ala Asp Asp Ser Ala Pro His Glu Thr Ala Phe Lys Arg Leu 

215 220 

Leu Thr Asn Asp Gly ser Asn Gly Glu Ser Gly Glu Ser Ser Gin Ser 
230 23S 240 

He Asn Asp Ala Ala Thr Arg Tyr Met Thr Aso Glu Tyr Arg Gin Phe 

245 250 255 

Pro Thr Arg Asn Gly Ala Gin Arg Ala Leu His Arg Val Val Thr Leu 

250 2S5 270 

Leu Ala Ala Gly Arg P ro Val Leu Thr Hi. Cys Phe Ala Gly Lys Asp 

280 285 

Arg Thr Gly Phe Val Val Ala Leu Val Leu Glu Ala Val Gly Leu Asp
290 295 300

Arg Asp Val Ile Val Ala Asp
305 310

(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2072 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:
CTCGTGCCGA TTCGGCACGA GCTGAGCAGC CCAAGGGGCC GTTCGGCGAA GTCATCGAGG 
CATTCGCCGA CGGGCTGGCC GGCAAGGGTA AGCAAATCAA CACCACGCTG AACAGCCTGT 
CGCAGGCGTT GAACGCCTTG AA7GAGGGCC GCGGCGACTT CTTCGCGGTG GTACGCAGCC 
TGGCGCTATT CGTCAACGCG CTACATCAGG ACGACCAACA GTTCGTCGCG TTGAACAAGA 
ACCTTGCGGA GTTCACCGAC AGGTTGACCC ACTCCGATGC GGACCTGTCG AACGCCATCC 
AGCAATTCGA CAGCTTGCTC GCCGTCGCGC GCCCGTTCTT CGCCAAGAAC CGCGAGGTGC 
TGACGCATGA CGTCAATAAT CTCGCGACCG TGACCACCAC GTTGCTGCAG CCCGATCCGT 
TGGATGGGTT GGAGACCGTC CTGCACATCT TCCCGACGCT GGCGGCGAAC ATTAACCAGC 
TTTACCATCC GACACACGGT GGCGTGGTGT CGCTTTCCGC GTTCACGAAT TTCGCCAACC 
CGATGGAGTT CATCTGCAGC TCGATTCAGG CGGGTAGCCG GCTCGGTTAT CAAGAGTCGG 
CCGAACTCTG TGCGCAGTAT CTGGCGCCAG TCCTCGATGC GATCAAGTTC AACTACTTTC 
CGTTCGGCCT GAACGTGGCC AGCACCGCCT CGACACTGCC TAAAGAGATC GCGTACTCCG 
AGCCCCGCTT GCAGCCGCCC AACGGGTACA AGGACACCAC GGTGCCCGGC ATCTGGGTGC 
CGGATACGCC GTTGTCACAC CGCAACACGC AGCCCGGTTG GGTGGTGGCA CCCGGGATGC 
AAGGGGTTCA GGTGGGACCS ATCACGCAGG GTTTGCTGAC GCCGGAGTCC CTGGCCGAAC 
TCATGGGTGG 7CCCGATATC GCCCCTCCGT CGTCAGGGCT GCAAACCCCG CCCGGACCCC 
CGAATGCGTA CGACGAGTAC CCCGTGCTGC CGCCGATCGG TTTACAGGCC CCACAGGTGC 
CGATACCACC GCCGC=TCCT GGGCCCGACG TAATCCCGGG TCCGGTGCCA CCGGTCTTGG 
CGGCGATCGT GTTCCCAAGA GATCGCCCGG CAGCGTCGGA AAACTTCGAC TACATGGGCC 
TCTTGTTGCT GTCGCCGGGC CTGGCGACCT TCCTGTTCGG GGTGTCATCT AGCCCCGCCC 
GTGGAACGAT GGCCGATCGG CACGTGTTGA TACCGGCGAT CACCGGCCTG GCGTTGATCG 
CGGCATTCGT CGCACATTCG TGGTACCGCA CAGAACATCC GCTCATAGAC ATGCGCTTGT 
TCCAGAACCG AGCGGTCGCG CAGGCCAACA TGACGATGAC GGTGCTCTCC CTCGGGCTGT 
TTGGCTCCTT CTTGCTGCTC CCGAGCTACC TCCAGCAAGT GTTGCACCAA TCACCGATGC
AATCGGGGGT GCATATCATC CCACAGGGCC TCGGTGCCAT GCTGGCGATG CCGATCGCCG 
GAGCGATGAT GGACCGACGG GGACCGGCCA AGATCGTGCT GGTTGGGATC ATGCTGATCG 
CTGCGGGGTT GGGCACCTTC GCCTTTGGTG TCGCGCGGCA AGCGGACTAC TTACCCATTC 



TGCCGACCGG GCTGGCAATC ATGGGCATGG GCATCWrTr CTCCATGATG CCACTGTCCG 1680
GGGCGGCAGT GCAGACCCTG GCCCCACATC AGflTrrrrrr CGGTTCGACG CTGATCAGCG 1740


TCAACCAGCA 


GGTGGGCGGT TCGATAf^na rr . rr »--,-.^ 

•o-wwuj. 4 ^.uniiiuvjVa/i L. L. CjL-ACTTGAT 


GTCGGTGCTG 


CTCACCTACC 


iaoo 


AGTTCAATCA CAGCGAAATC ATCGCTACTG CAAAGAAAGT CGCACTGACC CCAGAGAGTG 1860
GCGCCGGGCG GGGGGCGGCG GTTGACCCTT CCTCGCTACC GCGCCAAACC AACTTCGCGG 1920
CCCAACTGCT GCATGACCTT TCGCACGCCT ACGCGGTGGT ATTCGTGATA GCGACCGCGC 1980
TAGTGGTCTC GACGCTGATC CCCGCGGCAT TCCTGCCGAA ACAGCAGGCT AGTCATCGAA 2040
GAGCACCGTT GCTATCCGCA TGACGTCTGC XT 2072


(2) INFORMATION FOR SEQ ID NO: 190:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1923 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

TCACCCCGGA GAAGTCGTTC GTCGACGACC TGGACATCGA CTCGCTGTCG ATGGTCGAGA 60
TCGCCGTGCA GACCGAGGAC AAGTACGGCG TCAAGATCCC CGACGAGGAC CTCGCCGGTC 120
TGCGTACCGT CGGTGACGTT GTCGCCTACA TCCAGAAGCT CGAGGAAGAA AACCCGGAGG 180
CGGCTCAGGC GTTGCGCSCG AAGATTGAGT CGGAGAACCC CGATGCGGCA CGAGCAGATC 240
GGTGCGTTTC ACCCACATCG CAAGCTCGAG ACGCCCGTCG TCCTCTTGCA CGCTCAGCCA 300
GGTTGGCGTG TCGCCGCCTT CCAGCAAGTG TTCCCACCAC ACGAAGGGAC CCTCGCGAAA 360
GGTGACTGAT CCGCGGACCA CATAGTCGAT GCCACCGTGG CTGACAATTG CGCCGGGTCC 420
GAGTTGGCGG GGGCCGAATT GCGGCATTGC GTCGAAGGCC AGCGGATCCC GGCGCCCGCC 480
CGGCGTGGCT GGTGTTTTGG GCCGCCGGAT GGCCACGACG AGAACGACGA TGGCGGCGAT 540
GAACAGCGCC ACGGCAATCA CGACCAGCAG ATTTCCCACG CATACCCTCT CGTACCGCTG 600
CGCCGCGGTT GGTCGATCGG TCGCATATCG ATGGCGCCGT TTAACGTAAC AGCTTTCGCG 660
GGACCGGGGG TCACAACGGG CGAGTTGTCC GGCCGGGAAC CCGGCAGGTC TCGGCCGCGG 720
TCACCCCAGC TCACTGGTGC ACCATCCGGG TGTCGGTGAG CGTGCAACTC AAACACACTC 780
AACGGCAACG GTTTCTCAGG TCACCAGCTC AACCTCGACC CGCAATCGCT CGTACGTTTC 840
GACCGCGCGC AGGTCGCGAG TCAGCAGCTT TGCGCCGGCA GCTTTCGCCG TGAAGCCGAC
CAGGGCATCG TAGGTTGCGC CACCGGTGAC ATCGTGCTCG GCGAGGTGGT CGGTCAAGCC 
GCGATATGAG CAGGCATCCA GTGCCAGGTA GTTGCTGGAG GTGATGTCCG CCAAGTAGGC
GTGGACGGCA ACAGGGGCAA TACGATGCGG CGGTGGTAGC CGGGTCAAGA CCGAATAGGT 
TTCCACAGCC GCGTGCGCGA TCAGATGGAC GCCACGGTTG AGCGCGCGCA CGGCGGCCTC 
GTGCCCTTCG TGCCAGGTCG CGAATCCGGC AACCAGCACG CTGGTGTCTG GTGCGATCAC 
CGCCGTGTGC GATCGAGCGT TTCCCGAACG ATTTCGTCGG TCAACGGGGG CAGGGGACGT 
TCTGGCCGTG CGACGAGAAC CGAGCCTTCC CGAACGAGTT CGACACCGGT CGGGGCCGGC 
TCAATCTCGA TGCGCCCATC GCGCTCGGTG ATCTCCACCT GGTCGTTCCC GCGCAAGCCA 
AGGCGCTCGC GAATCCGCTT GGGAATCACC AGACGTCCTG CGACATCGAT GGTTGTTCGC 
ATGGTAGGAA ATTTACCATC GCACGTTCCA TAGGCGTGTC CTGCGCGGGA TGTCGGGACG
ATCCGCTAGC GTATCGAACG ATTGTTTCGG AAATGGCTGA GGGAGCGTGC GGTGCGGGTG 
ATGGGTGTCG ATCCCGGGTT GACCCGATGC GGGCTGTCGC TCATCGAGAG TGGGCGTGGT 
CGGCAGCTCA CCGCGCTGGA TGTCGACGTG GTGCGCACAC CGTCGGATGC GGCCTTGGCG 
CAGCGCCTGT TGGCCATCAG CGATGCCGTC GAGCACTGGC TGGACACCCA TCATCCGGAG 
GTGGTGGCTA TCGAACGGGT GTTCTCTCAG CTCAACGTGA CCACGGTGAT GGGCACCGCG 
CAGGCCGGCG GCGTGATCGC CCTGGCGGCG GCCAAACGTG GTGTCGACGT GCATTTCCAT 
ACCCCCAGCG AGGTCAAGGC GGCGGTCACT GGCAACGGTT CCGCAGACAA GGCTCAGGTC
ACC 

(2) INFORMATION FOR SEQ ID NO: 191:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1055 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:
CTGGCGTGCC AGTGTCACCG GCGATATGAC GTCGGCATTC AATTTCGCGG CCCCGCCGGA 
CCCGTCGCCA CCCAATCTGG ACCACCCGGT CCGTCAATTG CCGAAGGTCG CCAAGTGCGT 
GCCCAATGTG GTGCTGGGTT TCTTGAACGA AGGCCTGCCG TATCGGGTGC CCTACCCCCA 
AACAACGCCA GTCCAGGAAT CCGGTCCCGC GCGGCCGATT CCCAGCGGCA TCTGCTAGCC 



GGGGATGGTT CAGACGTAAC GGTTGGCTAG GTCGAAACCC GCGCCAGGGC CGCTGGACGG
GCTCATGGCA GCGAAATTAG AAAACCCGGG ATATTGTCCG CGGATTGTCA TACGATGCTG 
AGTGCTTGGT GGTTCGTGTT TAGCCATTGA GTGTGGATGT GTTGAGACCC TGGCCTGGAA 
GGGGACAACG TGCTTTTGCC TCTTGGTCCG CCTTTGCCGC CCGACGCGGT GGTGGCGAAA 
CGGGCTGAGT CGGGAATGCT CGGCGGGTTG TCGGTTCCGC TCAGCTGGGG AGTGGCTGTG 
CCACCCGATG ATTATGACCA CTGGGCGCCT GCGCCGGAGG ACGGCGCCGA TGTCGATGTC 
CAGGCGGCCG AAGGGGCGGA CGCAGAGGCC GCGGCCATGG ACGAGTGGGA TGAGTGGCAG 
GCGTGGAACG AGTGGGTGGC GGAGAACGCT GAACCCCGCT TTGAGGTGCC ACGGAGTAGC 
AGCAGCGTGA TTCCGCATTC TCCGGCGGCC GGCTAGGAGA GGGGGCGCAG ACTGTCGTTA 
TTTGACCAGT GATCGGCGGT CTCGGTGTTC CCGCGGCCGG CTATGACAAC AGTCAATGTG 
CATGACAAGT TACAGGTATT AGGTCCAGGT TCAACAAGGA GACAGGCAAC ATGGCAACAC 
GTTTTATGAC GGATCCGCAC GCGATGCGGG ACATGGCGGG CCGTTTTGAG GTGCACGCCC 
AGACGGTGGA GGACGAGGCT CGCCGGATGT GGGCGTCCGC GCAAAACATC TCGGGNGCGG 
GCTGGAGTGG CATGGCCGAG GC3ACCTCGC TAGAC 
(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 
CCGCCTCGTT GTTGGCATAC TCCGCCGCGG CCGCCTCGAC CGCACTGGCC GTGGCGTGTG 
TCCGGGCTGA CCACCGGGAT CGCCGAACCA TCCGAGATCA CCTCGCAATG ATCCACCTCG 
CGCAGCTGGT CACCCAGCCA CCGGGCGGTG TGCGACAGCG CCTGCATCAC CTTGGTATAG 
CCGTCGCGCC CCAGCCGCAG GAAGTTGTAG TACTGGCCCA CCACCTGGTT ACCGGGACGG 
GAGAAGTTCA GGGTGAAGGT CGGCATGTCG CCGCCGAGGT AGTTGACCCG GAAAACCAGA 
TCCTCCGGCA GGTGCTCGGG CCCGCGCCAC ACGACAAACC CGACGCCGGG ATAGGTCAG 
(2) INFORMATION FOR SEQ ID NO: 193:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 350 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:
AACGGGCCCG TGGGCACCGC TCCTCTAAGG GCTCTCGTTG GTCGCATGAA GTGCTGGAAG 
GATGCATCTT GGCAGATTCC CGCCAGAGCA AAACAGCCGC TAGTCCTAGT CCGAGTCGCC 
CGCAAAGTTC CTCGAATAAC TCCGTACCCG GAGCGCCAAA CCGGGTCTCC TTCGCTAAGC 
TGCGCGAACC ACTTGAGGTT CCGGGACTCC TTGACGTCCA GACCGATTCG TTCGAGTGGC 
TGATCGGTTC GCCGCGCTGG CGCGAATCCG CCGCCGAGCG GGGTGATGTC AACCCAGTGG 
GTGGCCTGGA AGAGGTGCTC TACGAGCTGT CTCCGATCGA GGACTTCTCC 
(2) INFORMATION FOR SEQ ID NO: 194 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 679 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 194 : 

Glu Gln Pro Lys Gly Pro Phe Gly Glu Val Ile Glu Ala Phe Ala Asp

10 15



Gly Leu Ala Gly Lys Gly Lys Gla Ile ^ Thr Thr ^ ^ ^ ^ 
20 25 30

Ser Gln Ala Leu Asn Ala Leu Asn Glu Gly Arg Gly Asp Phe Phe Ala
35 40 45

Val Val Arg Ser Leu Ala Leu Phe Val Asn Ala Leu His Gln Asp Asp

55 60 

Gln Gln Phe Val Ala Leu Asn Lys Asn Leu Ala Glu Phe Thr Asp Arg
70 75 80

Leu Thr His Ser Asp Ala Asp Leu Ser Asn Ala Ile Gln Gln Phe Asp
85 90 95 

Ser Leu Leu Ala Val Ala Arg Pro Phe Phe Ala Lys Asn Arg Glu Val 
100 105 110 

Leu Thr His Asp Val Asn Asn Leu Ala Thr Val Thr Thr Thr Leu Leu 
115 120 125

Gln Pro Asp Pro Leu Asp Gly Leu Glu Thr Val Leu ^ ^ ^

130 135 140

Thr Leu Ala Ala Asn Ile Asn Gln Leu Tyr His Pro Thr His Gly Gly
145 150 155 160

Val Val Ser Leu Ser Ala Phe Thr Asn Phe Ala Asn Pro Met Glu Phe 
165 170 175 

Ile Cys Ser Ser Ile Gln Ala Gly Ser Arg Leu Gly Tyr Gln Glu Ser
180 185 190

Ala Glu Leu Cys Ala Gln Tyr Leu Ala Pro Val Leu Asp Ala Ile



195 



200 



Lys 



205 



Phe Asn Tyr Phe Pro Phe Gly Leu Asn Val Ala Ser Thr Ala 



210 



215 220 
Leu Pro Lys Glu Ile Ala Tyr Ser Glu Pro Arg Leu Gln Pro



225 



230 



235 



Ser Thr 



Pro Asn 
240 



Gly Tyr Lys Asp Thr Thr Val Pro Gly Ile Trp Val Pro Asp Thr Pro
245 250 



Leu Ser His Arg Asn Thr Gln Pro Gly Trp Val Val Ala Pro
260

Gln Gly Val Gln Val Gly Pro Ile Thr Gln Gly Leu Leu Thr



275 



280 



255 



Gly Met 



Pro Glu 



285 



Ser Leu Ala Glu Leu Met Gly Gly Pro Asp Ile Ala Pro Pro Ser Ser
295 300

Gly Leu Gln Thr Pro Pro Gly Pro Pro Asn Ala Tyr Asp Glu Tyr Pro
310 315 320

Val Leu Pro Pro Ile Gly Leu Gln Ala Pro Gln Val Pro Ile Pro Pro
325 330 335



Pro Pro Pro Gly Pro Asp Val Ile Pro Gly Pro Val Pro Pro Val



340 



Leu 



345 



350 



Ala Ala Ile Val Phe Pro Arg Asp Arg Pro Ala Ala Ser Glu Asn Phe
355 360 365 

Asp Tyr Met Gly Leu Leu Leu Leu Ser Pro Gly Leu Ala Thr Phe Leu 

370 375 

Phe Gly Val Ser Ser Ser Pro Ala Arg Gly Thr Met Ala Asp Arg His 



390 



395 



Val Leu Ile Pro Ala Ile Thr Gly Leu Ala Leu Ile Ala Ala



405 



410 



400 

Phe Val 
415 



Ala His Ser Trp Tyr Arg Thr Glu His Pro Leu Ile Asp Met Arg Leu
420 425 430

Phe Gln Asn Arg Ala Val Ala Gln Ala Asn Met Thr Met Thr Val Leu
435 440 445

Ser Leu Gly Leu Phe Gly Ser Phe Leu Leu Leu Pro Ser Tyr Leu Gln
450 455 460



Gln Val Leu His Gln Ser Pro Met Gln Ser Gly Val His Ile Ile Pro
465 470 475 480 

Gln Gly Leu Gly Ala Met Leu Ala Met Pro Ile Ala Gly Ala Met Met
485 490 495

Asp Arg Arg Gly Pro Ala Lys Ile Val Leu Val Gly Ile Met Leu Ile
500 505 510 



Ala Ala Gly Leu Gly Thr Phe Ala Phe Gly Val Ala Arg Gln Ala



515 



520 525 



Asp 



Tyr Leu Pro Ile Leu Pro Thr Gly Leu Ala Ile Met Gly Met Gly Met

530 535 540 

Gly Cys Ser Met Met Pro Leu Ser Gly Ala Ala Val Gin Thr Leu Ala 

550 555 560 

Pro His Gln Ile Ala Arg Gly Ser Thr Leu Ile Ser Val Asn Gln Gln

565 570 575

Val Gly Gly Ser Ile Gly Thr Ala Leu Met Ser Val Leu Leu Thr



580 



585 590



Tyr 



Gln Phe Asn His Ser Glu Ile Ile Ala Thr Ala Lys Lys Val Ala Leu
595 600 605

Thr Pro Glu Ser Gly Ala Gly Arg Gly Ala Ala Val Asp Pro Ser Ser
610 615 620

Leu Pro Arg Gln Thr Asn Phe Ala Ala Gln Leu Leu His Asp Leu Ser
625 630 635 640

His Ala Tyr Ala Val Val Phe Val Ile Ala Thr Ala Leu Val Val Ser
645 650

Thr Leu Ile Pro Ala Ala Phe Leu Pro Lys Gln Gln Ala Ser His Arg
660 665 670

Arg Ala Pro Leu Leu Ser Ala 
675 



(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

Thr Pro Glu Lys Ser Phe Val Asp Asp Leu Asp Ile Asp Ser Leu Ser

5 10 15 

Met Val Glu Ile Ala Val Gln Thr Glu ^ Lys ^ Gly ^ ^ ^

25 30 
Pro Asp Glu Asp Leu Ala Gly Leu Arg Thr Val Gly Asp Val Val Ala 

40 45 

Tyr Ile Gln Lys Leu Glu Glu Glu Asn Pro Glu Ala Ala Gln Ala Leu

55 60

Arg Ala Lys Ile Glu Ser Glu Asn Pro Asp Ala Ala Arg Ala Asp Arg

70 75 80

Cys Val Ser Pro Thr Ser Gln Ala Arg Asp Ala Arg Arg Pro Leu Ala



90 95 



Arg Ser Ala Arg Leu Ala Cys Arg Arg Leu Pro Ala Ser Val Pro Thr 

° 110 
Thr Arg Arg Asp Pro Arg Glu Arg 
115 120

(2) INFORMATION FOR SEQ ID NO: 196: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 89 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:

Leu Ala Cys Gln Cys His Arg Arg Tyr Asp Val Gly Ile Gln Phe Arg

10 15 

Gly Pro Ala Gly Pro Val Ala Thr Gln Ser Gly Pro Pro Gly Pro Ser

20 25 30

Ile Ala Glu Gly Arg Gln Val Arg Ala Gln Cys Gly Ala Gly Phe Leu
35 40 45 

Glu Arg Arg Pro Ala Val Ser Gly Ala Leu Pro Pro Asn Asn Ala Ser 

55 60 

Pro Gly Ile Arg Ser Arg Ala Ala Asp Ser Gln Arg His Leu Leu Ala



75 



80 



Gly Asp Gly Ser Asp Val Thr Val Gly 
85 

(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 119 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

Ala Ser Leu Leu Ala Tyr Ser Ala Ala Ala Ala Ser Thr Ala Leu Ala

5 10 15

Val Ala Cys Val Arg Ala Asp His Arg Asp Arg Arg Thr Ile Arg Asp

25 30

His Leu Ala Met Ile His Leu Ala Gln Leu Val Thr Gln Pro Pro Gly

40 45

Gly Val Arg Gln Arg Leu His His Leu Gly Ile Ala Val Ala Pro Gln

55 60

Pro Gln Glu Val Val Val Leu Ala His His Leu Val Thr Gly Thr Gly

70 75 80

Glu Val Gln Gly Glu Gly Arg His Val Ala Ala Glu Val Val Asp Pro

85 90 95

Glu Asn Gln Ile Leu Arg Gln Val Leu Gly Pro Ala Pro His Asp Lys

100 105 110

Pro Asp Ala Gly Ile Gly Gln

115



(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 116 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 

Arg Ala Arg Gly His Arg Ser Ser Lys Gly Ser Arg Trp Ser His Glu

10 15

Val Leu Glu Gly Cys Ile Leu Ala Asp Ser Arg Gln Ser Lys Thr Ala
20 25 30

Ala Ser Pro Ser Pro Ser Arg Pro Gln Ser Ser Ser Asn Asn Ser Val

40 45

Pro Gly Ala Pro Asn Arg Val Ser Phe Ala Lys Leu Arg Glu Pro Leu

55 60

Glu Val Pro Gly Leu Leu Asp Val Gln Thr Asp Ser Phe Glu Trp Leu

70 75 80

Ile Gly Ser Pro Arg Trp Arg Glu Ser Ala Ala Glu Arg Gly Asp Val

85 90 95

Asn Pro Val Gly Gly Leu Glu Glu Val Leu Tyr Glu Leu Ser Pro Ile
100 110

Glu Asp Phe Ser
115

(2) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 811 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:
TGCTACGCAG CAATCGCTTT GGTGACAGAT GTGGATGCCG GCGTCGC^C TGGCGATGGC 
GTGAAAGCCG CCGACGTGTT CGCCGCATTC GGGGAGAACA TCGAACTGCT CAAAAGGCTG 
GTGCGGGCCG CCATCGATCG GGTCGCCGAC GAGCGCACGT GCACGCACTG TCAACACCAC 
GCCGGTGTTC CGTTGCCGTT CGAGCTGCCA TGAGGGTGCT GCTGACCGGC GCGGCCGGCT 
TCATCGGGTC GCGCGTGGAT GCGGCGTTAC GGGCTGCGGG TCACGACGTG GTGGGCGTCG 
ACSCGCTGCT GCCCGCCGCG CACGGGCCAA ACCCGGTGCT GCCACCGGGC TGCCAGCGGG 
TCGACGTGCG CGACGCCAGC GCGCTGGCCC CGTTGTTGGC CGGTGTCGAT CTGGTGTGTC 
ACCAGGCCGC CATGGTGGGT GCCGGCGTCA ACGCCGCCGA CGCACCCGCC TATGGCGGCC 
ACAACGATTT CGCCACCACG GTGCTGCTGG CGCAGATGTT CGCCGCCGGG GTCCGCCGTT 
TGGTGCTGGC GTCGTCGATG GTGGTTTACG GGCAGGGGCG CTATGACTGT CCCCAGCATG 
GACCGGTCGA CCCGCTGCCG CGGCGGCGAG CCGACCTGGA CAATGGGGTC TTCGAGCACC 
GTTGCCCGGG GTGCGGCGAG CCAGTCATCT GGCAATTGGT CGACGAAGAT GCCCCGTTGC 
GCCCGCGCAG CCTGTACGCG GCAGCAAGAC CGCGCAGGAG CACTACGCGC TGGCGTGGTC 
GGAAACGAAT GGCGGTTCCG TGGTGGCGTT G
(2) INFORMATION FOR SEQ ID NO: 200: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 966 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO:200: 
GTCCCGCGAT GTGGCCGAGC ATGACTTTCG GCAACACCGG CGTAGTAGTC GAAGATATCG 
GACTTTGTGG TCCCGGTGGC GGGATAGAGC ACCTGTCGGC GTTGGTCAGC GTCACCCGTT 
GCTCGGACGC CGAACCCATG CTTTCAACGT AGCCTGTCGG TCACACAAGT CGCGAGCGTA 
ACGTCACGGT CAAATATCGC GTGGAATTTC GCCGTGACGT TCCGCTCGCG GACAATCAAG 
GCATACTCAC TTACATGCGA GCCATTTGGA CGGGTTCGAT CGCCTTCGGG CTGGTGAACG 
TGCCGGTCAA GGTGTACAGC GCTACCGCAG ACCACGACAT CAGGTTCCAC CAGGTGCACG 
CCAAGGACAA C3GAC3CATC CGGTACAAGC GCGTCTGCGA GGCGTGTGGC GAGGTGGTCG 
ACTACCGCGA TCTTGCCCGG GCCTACGAGT CCGGCGACGG CCAAATGGTG GCGATCACCG 
ACGACGACAT CGCCAGCTTG CCTGAAGAAC GCAGCCGGGA GATCGAGGTG TTGGAGTTCG 
TCCCCGCCGC CGACGTGGAC CCGATGATGT TCGACCGCAG CTACTTTTTG GAGCCTGATT 
CGAAGTCGTC GAAATCGTAT GTGCTGCTGG CTAAGACACT CGCCGAGACC GACCGGATGG 
CGATCGTGGA TCGCCCCACC GGCCGTGAAT GCAGGAAAAA TAAGAGCCGC TATCCACAAT 
TCGGCGTCGA GCTCGGCTAC CACAAACGGT AGAACGATCG AGACATTCCC GAGCTGAAGT 
GCGGCGCTAT AGAAGCCGCT CTGCGC3ATT ATCAAACGCA AAATACGCTT ACTCATGCCA 
TCGGCGCTGC TCACCCGATG CGACGTTTTT GCCACGCTCC ACCGCCTGCC GCGCGACCTC 
AAGTGGGCAT GCATCCCACC CGTTCCCGGA AACCGGTTCC GGCGGGTCGG CTCATCGCTT 
CATCCT 

(2) INFORMATION FOR SEQ ID NO: 201:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2367 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:

CCGCACCGCC GGCAATACCG CCAGCGCCAC CGTTACCGCC GTTTGCGCCG TTGCCCCCGT

TGCCGCCCGT CCCGCCGGCC CCGCCGATGG AGTTCTCATC GCCAAAAGTA CTGGCGTTGC 

CACCGGAGCC GCCGTTGCCG CCGTCACCGC CAGCCCCGCC GACTCCACCG GCCCCACCGA 



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CTCCGCCGCT GCCACCGTTG CCGCCGTTGC CGATCAACAT GCCGCTGGCG CCACCCTTGC 
CACCCACGCC ACCGGCTCCG CCCACCCCGC CGACACCAAG CGAGCTGCCG CCGGAGCCAC 
CATCACCACC TACGCCACCG ACCGCCCAGA CACCAGCGAC CGGGTCTTCG TGAAACGTCG 
CGGTGCCACC ACCGCCGCCG TTACCGCCAA CCCCACCGGC AACGCCGGCG CCGCCATCCC 
CGCCGGCCCC GGCGTTGCCG CCGTTGCCGC CGTTGCCGAA CAACAACCCG CCGGCGCCGC 
CGTTGCCGCC CGCGCCGCCG GTCCCGCCGG CGCCGCCGAC GCCAAGGCCG CTGCCGCCCT 
TGCCGCCATC ACCACCCTTG CCGCCGACCA CATCGGGTTC TGCCTCGGGG TCTGGGCTGT 
CAAACCTCGC GATGCCAGCG TTGCCGCCGC TTCCCCCGGG CCCCCCCGTG GCGCCGTCAC 
CACCGATACC ACCCGCGCCA CCGGCGCCAC CGTTGCCGCC ATCACCGAAT AGCAACCCGC 
CGGCGCCACC ATTGCCGCCA GCTCCCCCTG CGCCACCGTC GGCGCCGGAG GCGGCACTGG 
CAGCCCCGTT ACCACCGAAA CCGCCGCTAC CACCGGTAGA GGTGGCAGTG GCGATGTGTA 
CGAAAGCGCC GCCTCCGGCG CCGCCGCTAC CACCCCCACT GCCGGCGGCT ACACCGTCGG 
ACCCGTTGCC ACCATCACCG CCAAAGGCGC TCGCAATGTC GCCCTGCGCG ACTCCGCCGT 
CGCCGCCGTT GCCGCCGCCG CCACCGGCAG CGGCGGTACC GCCGTCACCA CCGGCACCGC 
CGGTGGCCTT GCCCGAGCCT GCCGTCGCGG TGGCACCGTC GCCGCCGGTG CCACCGGTCG 
GCGTGCCGGC AGTGCCATGG CCGCCCGTGC CGCCGTCGCC GCCGGTTTGA TCACCGATGC 
CGGACACATC TGCCGGGCTG TCCCCGGTGC TGGCCGCGGG GCCGGGCGTG GGATTGACCC 
CGTTTGCCCC GGCGAGGCCG GCGCCGCCGG TACCACCGGC GCCGCCATGG CCGAACAGCC 
CGGCGTTGCC GCCGTTACCG CCCGCACCCC CGATGCCTGC GGCCACGCTG GTGCCGCCGA 
CACCGCCGTT GCCGCCGTTG CCCCACAACC ACCCCCCGTT CCCACCGGCA CCGCCGGCCG 
CGCCGGTACC ACCGGCCCCG CCGTTGCCGC CGTTGCCGAT CAACCCGGCC GCGCCTCCGC 
TGCCGCCGGT TTGACCGAAC CCGCCAGCCG CGCCGTTGCC ACCGTTGCCA AACAGCAACC 
CGCCGGCCGC GCCAGGCTGC CCGGGTGCCG TCCCGTCGGC GCCGTTTCCG ATCAACGGGC 
GCCCCAAAAG CGCCTCGGTG GGCGCATTCA CCGCACCCAG CAGACTCCGC TCAACAGCGG 
CTTCAGTGCT GGCATACCGA CCCGCGGCCG CAGTCAACGC CTGCACAAAC TGCTCGTGAA 1680
ACGCTGCCAC CTGTACGCTG AGCGCCTGAT ACTGCCGAGC ATGGGCCCCG AACAACCCCG 1740
CAATCGCCGC CGACACTTCA TCGGCAGCCG CAGCCACCAC TTCCGTCGTC GGGATCGCCG 
CGGCCGCATT AGCCGCGCTC ACCTGCGAAC CAATAGTCGA TAAATCCAAA GCCGCAGTTG 
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CCAGCAGCTG CGGCGTCGCG ATCACCAAGG ACACCTCGCA CCTCCGGATA CCCCATATCG 
CCGCACCGTG TCCCCAGCGG CCACGTGACC TTTGGTCGCT GGCTGGCGGC CCTGACTATG 
GCCGCGACGG CCCTCGTTCT GATTCGCCCC GGCGCGCAGC TTGTTGCGCG AGTTGAAGAC 
GGGAGGACAG GCCGAGCTTG GTGTAGACGT GGGTCAAGTG GGAATGCACG GTCCGCGGCG 
AGATGAATAG GCGGACGCCG ATCTCCTTGT TGCTGAGTCC CTCACCGACC AGTAGAGCCA 
CCTCAAGCTC TGTCGGTGTC AACGCGCCCC AGCCACTTGT CGGGCGTTTC CGTGCACCGC 
GGCCTCGTTG CGCGTACGCG ATCGCCTCAT CGATCGATAA CGCAGTTCCT TCGGCCCAGG 
CATCGTCGAA CTCGCTGTCA CCCATGGATT TTCGAAGGGT GGCTAGCGAC GAGTTACAGC 
CCGCCTGGTA GATCCCGAAG CGGACCG 
(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 375 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY : linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

Gln Pro Ala Gly Ala Thr Ile Ala Ala Ser Ser Pro Cys Ala Thr Val
10 15

Gly Ala Gly Gly Gly Thr Gly Ser Pro Val Thr Thr Glu Thr Ala Ala
25 30

Thr Thr Gly Arg Gly Gly Ser Gly Asp Val Tyr Glu Ser Ala Ala Ser
35 40 45

Gly Ala Ala Ala Thr Thr Pro Thr Ala Gly Gly Tyr Thr Val Gly Pro
55 60

Val Ala Thr Ile Thr Ala Lys Gly Ala Arg Asn Val Ala Leu Arg Asp
70 75 80

Ser Ala Val Ala Ala Val Ala Ala Ala Ala Thr Gly Ser Gly Gly Thr
85 90 95

Ala Val Thr Thr Gly Thr Ala Gly Gly Leu Ala Arg Ala Cys Arg Arg
100 105 110

Gly Gly Thr Val Ala Ala Gly Ala Thr Gly Arg Arg Ala Gly Ser Ala
115 120 125

Met Ala Ala Arg Ala Ala Val Ala Ala Gly Leu Ile Thr Asp Ala Gly



His Ile Cys Arg Ala Val Pro Gly Ala Gly Arg Gly Ala Gly Arg Gly
150 155 160

Ile Asp Pro Val Cys Pro Gly Glu Ala Gly Ala Ala Gly Thr Thr Gly
165 170 175

Ala Ala Met Ala Glu Gln Pro Gly Val Ala Ala Val Thr Ala Arg Thr
180 185 190

Pro Asp Ala Cys Gly His Ala Gly Ala Ala Asp Thr Ala Val Ala Ala
195 200 205

Val Ala Pro Gln Pro Pro Pro Val Pro Thr Gly Thr Ala Gly Arg Ala
215 220

Gly Thr Thr Gly Pro Ala Val Ala Ala Val Ala Asp Gln Pro Gly Arg
225 230 235 240

Ala Ser Ala Ala Ala Gly Leu Thr Glu Pro Ala Ser Arg Ala Val Ala
245 250 255

Thr Val Ala Lys Gln Gln Pro Ala Gly Arg Ala Arg Leu Pro Gly Cys
265 270

Arg Pro Val Gly Ala Val Ser Asp Gln Arg Ala Pro Gln Lys Arg Leu
275 280 285

Gly Gly Arg Ile His Arg Thr Gln Gln Thr Pro Leu Asn Ser Gly Phe
290 295 300

Ser Ala Gly Ile Pro Thr Arg Gly Arg Ser Gln Arg Leu His Lys Leu
310 315 320

Leu Val Lys Arg Cys His Leu Tyr Ala Glu Arg Leu Ile Leu Pro Ser
325 330 335

Met Gly Pro Glu Gln Pro Arg Asn Arg Arg Arg His Phe Ile Gly Ser
340 345 350

Arg Ser His His Phe Arg Arg Arg Asp Arg Arg Gly Arg Ile Ser Arg
355 360 365

Ala His Leu Arg Thr Asn Ser Arg
370 375

(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2852 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:
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GGCCAAAACG CCCCGGCGAT CGCGGCCACC GAGGCCGCCT ACGACCAGAT GTGGGCCCAG 
GACGTGGCGG CGATGTTTGG CTACCATGCC GGGGCTTCGG CGGCCGTCTC GGCGTTGACA 
CCGTTCGGCC AGGCGCTGCC GACCGTGGCG GGCGGCGGTG CGCTGGTCAG CGCGGCCGCG 
GCTCAGGTGA CCACGCGGGT CTTCCGCAAC CTGGGCTTGG CGAACGTCCG CGAGGGCAAC 
GTCCGCAACG GTAATGTCCG GAACTTCAAT CTCGGCTCGG CCAACATCGG CAACGGCAAC 
ATCGGCAGCG GCAACATCGG CAGCTCCAAC ATCGGGTTTG GCAACGTGGG TCCTGGGTTG 
ACCGCAGCGC TGAACAACAT CGGTTTCGGC AACACCGGCA GCAACAACAT CGGGTTTGGC 
AACACCGGCA GCAACAACAT CGGGTTCGGC AATACCGGAG ACGGCAACCG AGGTATCGGG 
CTCACGGGTA GCGGTTTGTT GGGGTTCGGC GGCCTGAACT CGGGCACCGG CAACATCGGT 
CTGTTCAACT CGGGCACCGG AAACGTCGGC ATCGGCAACT CGGGTACCGG GAACTGGGGC 
ATTGGCAACT CGGGCAACAG CTACAACACC GGTTTTGGCA ACTCCGGCGA CGCCAACACG 
GGCTTCTTCA ACTCCGGAAT AGCCAACACC GGCGTCGGCA ACGCCGGCAA CTACAACACC 
GGTAGCTACA ACCCGGGCAA CAGCAATACC GGCGGCTTCA ACATGGGCCA GTACAACACG 
GGCTACCTGA ACAGCGGCAA CTACAACACC GGCTTGGCAA ACTCCGGCAA TGTCAACACC 
GGCGCCTTCA TTACTGGCAA CTTCAACAAC GGCTTCTTGT GGCGCGGCGA CCACCAAGGC 
CTGATTTTCG GGAGCCCCGG CTTCTTCAAC TCGACCAGTG CGCCGTCGTC GGGATTCTTC 
AACAGCGGTG CCGGTAGCGC GTCCGGCTTC CTGAACTCCG GTGCCAACAA TTCTGGCTTC 
TTCAACTCTT CGTCGGGGGC CATCGGTAAC TCCGGCCTGG CAAACGCGGG CGTGCTGGTA 
TCGGGCGTGA TCAACTCGGG CAACACCGTA TCGGGTTTGT TCAACATGAG CCTGGTGGCC 
ATCACAACGC CGGCCTTGAT CTCGGGCTTC TTCAACACCG GAAGCAACAT GTCGGGATTT 
TTCGGTGGCC CACCGGTCTT CAATCTCGGC CTGGCAAACC GGGGCGTCGT GAACATTCTC 
GGCAACGCCA ACATCGGCAA TTACAACATT CTCGGCAGCG GAAACGTCGG TGACTTCAAC 
ATCCTTGGCA GCGGCAACCT CGGCAGCCAA AACATCTTGG GCAGCGGCAA CGTCGGCAGC 
TTCAATATCG GCAGTGGAAA CATCGGAGTA TTCAATGTCG GTTCCGGAAG CCTGGGAAAC 
TACAACATCG GATCCGGAAA CCTCGGGATC TACAACATCG GTTTTGGAAA CGTCGGCGAC 
TACAACGTCG GCTTCGGGAA CGCGGGCGAC TTCAACCAAG GCTTTGCCAA CACCGGCAAC 
AACAACATCG GGTTCGCCAA CACCGGCAAC AACAACATCG GCATCGGGCT GTCCGGCGAC 
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AACCAGCAGG GCTTCAATAT TGCTAGCGGC TGGAACTCGG GCACCGGCAA CAGCGGCCTG 1680 

TTCAATTCGG GCACCAATAA CGTTGGCATC TTCAACGCGG GCACCGGAAA CGTCGGCATC 1740 

GCAAACTCGG GCACCGGGAA CTGGGGTATC GGGAACCCGG GTACCGACAA TACCGGCATC 1800 

CTCAATGCTG GCAGCTACAA CACGGGCATC CTCAACGCCG GCGACTTCAA CACGGGCTTC 1860

TACAACACGG GCAGCTACAA CACCGGCGGC TTCAACGTCG GTAACACCAA CACCGGCAAC 1920 

TTCAACGTGG GTGACACCAA TACCGGCAGC TATAACCCGG GTGACACCAA CACCGGCTTC 1980 

TTCAATCCCG GCAACGTCAA TACCGGCGCT TTCGACACGG GCGACTTCAA CAATGGCTTC 2040 

TTGGTGGCGG GCGATAACCA GGGCCAGATT GCCATCGATC TCTCGGTCAC CACTCCATTC 2100 

ATCCCCATAA ACGAGCAGAT GGTCATTGAC GTACACAACG TAATGACCTT CGGCGGCAAC 2160 

ATGATCACGG TCACCGAGGC CTCGACCGTT TTCCCCCAAA CCTTCTATCT GAGCGGTTTG 2220 

TTCTTCTTCG GCCCGGTCAA TCTCAGCGCA TCCACGCTGA CCGTTCCGAC GATCACCCTC 2280 

ACCATCGGCG GACCGACGGT GACCGTCCCC ATCAGCATTG TCGGTGCTCT GGAGAGCCGC 2340

ACGATTACCT TCCTCAAGAT CGATCCGGCG CCGGGCATCG GAAATTCGAC CACCAACCCC 2400 

TCGTCCGGCT TCTTCAACTC GGGCACCGGT GGCACATCTG GCTTCCAAAA CGTCGGCGGC 2460 

GGCAGTTCAG GCGTCTGGAA CAGTGGTTTG AGCAGCGCGA TAGGGAATTC GGGTTTCCAG 2520 

AACCTCGGCT CGCTGCAGTC AGGCTGGGCG AACCTGGGCA ACTCCGTATC GGGCTTTTTC 2580 

AACACCAGTA CGGTGAACCT CTCCACGCCG GCCAATGTCT CGGGCCTGAA CAACATCSGC 2640

ACCAACCTGT CCGGCGTGTT CCGCGGTCCG ACCGGGACGA TTTTCAACGC GGGCCTTGCC 2700 

AACCTGGGCC AGTTGAACAT CGGCAGCGCC TCGTGCCGAA TTCGGCACGA GTTAGATACG 2760 

GTTTCAACAA TCATATCCGC GTTTTGCGGC AGTGCATCAG ACGAATCGAA CCCGGGAAGC 2820

GTAAGCGAAT AAACCGAATG GCGGCCTGTC AT 2852 
(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 943 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:204: 

Gly Gln Asn Ala Pro Ala Ile Ala Ala Thr Glu Ala Ala Tyr Asp Gln
1 5 10 15
Met Trp Ala Gln Asp Val Ala Ala Met Phe Gly Tyr His Ala Gly Ala
25 30

Ser Ala Ala Val Ser Ala Leu Thr Pro Phe Gly Gln Ala Leu Pro Thr
40 45

Val Ala Gly Gly Gly Ala Leu Val Ser Ala Ala Ala Ala Gln Val Thr
55 60

Thr Arg Val Phe Arg Asn Leu Gly Leu Ala Asn Val Arg Glu Gly Asn
70 75 80

Val Arg Asn Gly Asn Val Arg Asn Phe Asn Leu Gly Ser Ala Asn Ile
85 90 95

Gly Asn Gly Asn Ile Gly Ser Gly Asn Ile Gly Ser Ser Asn Ile Gly
100 105 110

Phe Gly Asn Val Gly Pro Gly Leu Thr Ala Ala Leu Asn Asn Ile Gly
120 125

Phe Gly Asn Thr Gly Ser Asn Asn Ile Gly Phe Gly Asn Thr Gly Ser
135 140

Asn Asn Ile Gly Phe Gly Asn Thr Gly Asp Gly Asn Arg Gly Ile Gly
150 155 160

Leu Thr Gly Ser Gly Leu Leu Gly Phe Gly Gly Leu Asn Ser Gly Thr
165 170 175

Gly Asn Ile Gly Leu Phe Asn Ser Gly Thr Gly Asn Val Gly Ile Gly
180 185 190

Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn Ser Gly Asn Ser Tyr
195 200 205

Asn Thr Gly Phe Gly Asn Ser Gly Asp Ala Asn Thr Gly Phe Phe Asn
215 220

Ser Gly Ile Ala Asn Thr Gly Val Gly Asn Ala Gly Asn Tyr Asn Thr
230 235 240

Gly Ser Tyr Asn Pro Gly Asn Ser Asn Thr Gly Gly Phe Asn Met Gly
245 250 255

Gln Tyr Asn Thr Gly Tyr Leu Asn Ser Gly Asn Tyr Asn Thr Gly Leu
265 270

Ala Asn Ser Gly Asn Val Asn Thr Gly Ala Phe Ile Thr Gly Asn Phe
280 285

Asn Asn Gly Phe Leu Trp Arg Gly Asp His Gln Gly Leu Ile Phe Gly
290 295 300

Ser Pro Gly Phe Phe Asn Ser Thr Ser Ala Pro Ser Ser Gly Phe Phe
305 310 315 320



Asn Ser Gly Ala Gly Ser Ala Ser Gly Phe Leu Asn Ser Gly Ala Asn
325 330 335

Asn Ser Gly Phe Phe Asn Ser Ser Ser Gly Ala Ile Gly Asn Ser Gly
340 345 350

Leu Ala Asn Ala Gly Val Leu Val Ser Gly Val Ile Asn Ser Gly Asn
355 360 365

Thr Val Ser Gly Leu Phe Asn Met Ser Leu Val Ala Ile Thr Thr Pro
375 380

Ala Leu Ile Ser Gly Phe Phe Asn Thr Gly Ser Asn Met Ser Gly Phe
390 395 400

Phe Gly Gly Pro Pro Val Phe Asn Leu Gly Leu Ala Asn Arg Gly Val
405 410 415

Val Asn Ile Leu Gly Asn Ala Asn Ile Gly Asn Tyr Asn Ile Leu Gly
420 425 430

Ser Gly Asn Val Gly Asp Phe Asn Ile Leu Gly Ser Gly Asn Leu Gly
435 440 445

Ser Gln Asn Ile Leu Gly Ser Gly Asn Val Gly Ser Phe Asn Ile Gly
455 460

Ser Gly Asn Ile Gly Val Phe Asn Val Gly Ser Gly Ser Leu Gly Asn
470 475 480

Tyr Asn Ile Gly Ser Gly Asn Leu Gly Ile Tyr Asn Ile Gly Phe Gly
485 490 495

Asn Val Gly Asp Tyr Asn Val Gly Phe Gly Asn Ala Gly Asp Phe Asn
500 505 510

Gln Gly Phe Ala Asn Thr Gly Asn Asn Asn Ile Gly Phe Ala Asn Thr
515 520 525

Gly Asn Asn Asn Ile Gly Ile Gly Leu Ser Gly Asp Asn Gln Gln Gly
535 540

Phe Asn Ile Ala Ser Gly Trp Asn Ser Gly Thr Gly Asn Ser Gly Leu
550 555 560

Phe Asn Ser Gly Thr Asn Asn Val Gly Ile Phe Asn Ala Gly Thr Gly
565 570 575

Asn Val Gly Ile Ala Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn
585 590

Pro Gly Thr Asp Asn Thr Gly Ile Leu Asn Ala Gly Ser Tyr Asn Thr
595 600 605
Gly Ile Leu Asn Ala Gly Asp Phe Asn Thr Gly Phe Tyr Asn Thr Gly
610 615 620

Ser Tyr Asn Thr Gly Gly Phe Asn Val Gly Asn Thr Asn Thr Gly Asn
625 630 635 640

Phe Asn Val Gly Asp Thr Asn Thr Gly Ser Tyr Asn Pro Gly Asp Thr
645 650 655

Asn Thr Gly Phe Phe Asn Pro Gly Asn Val Asn Thr Gly Ala Phe Asp
660 665 670

Thr Gly Asp Phe Asn Asn Gly Phe Leu Val Ala Gly Asp Asn Gln Gly
675 680 685

Gln Ile Ala Ile Asp Leu Ser Val Thr Thr Pro Phe Ile Pro Ile Asn
690 695 700

Glu Gln Met Val Ile Asp Val His Asn Val Met Thr Phe Gly Gly Asn
705 710 715 720

Met Ile Thr Val Thr Glu Ala Ser Thr Val Phe Pro Gln Thr Phe Tyr
725 730 735

Leu Ser Gly Leu Phe Phe Phe Gly Pro Val Asn Leu Ser Ala Ser Thr
740 745 750

Leu Thr Val Pro Thr Ile Thr Leu Thr Ile Gly Gly Pro Thr Val Thr
755 760 765

Val Pro Ile Ser Ile Val Gly Ala Leu Glu Ser Arg Thr Ile Thr Phe
770 775 780

Leu Lys Ile Asp Pro Ala Pro Gly Ile Gly Asn Ser Thr Thr Asn Pro
785 790 795 800

Ser Ser Gly Phe Phe Asn Ser Gly Thr Gly Gly Thr Ser Gly Phe Gln
805 810 815

Asn Val Gly Gly Gly Ser Ser Gly Val Trp Asn Ser Gly Leu Ser Ser
820 825 830

Ala Ile Gly Asn Ser Gly Phe Gln Asn Leu Gly Ser Leu Gln Ser Gly
835 840 845

Trp Ala Asn Leu Gly Asn Ser Val Ser Gly Phe Phe Asn Thr Ser Thr
850 855 860

Val Asn Leu Ser Thr Pro Ala Asn Val Ser Gly Leu Asn Asn Ile Gly
865 870 875 880

Thr Asn Leu Ser Gly Val Phe Arg Gly Pro Thr Gly Thr Ile Phe Asn
885 890 895
Ala Gly Leu Ala Asn Leu Gly Gln Leu Asn Ile Gly Ser Ala Ser Cys
900 905 910

Arg Ile Arg His Glu Leu Asp Thr Val Ser Thr Ile Ile Ser Ala Phe
915 920 925

Cys Gly Ser Ala Ser Asp Glu Ser Asn Pro Gly Ser Val Ser Glu
935

(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:
GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC 53
(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:206: 

CCTGAATTCA GGCCTC3GTT GCGCCGGCCT CATCTTGAAC GA 42 

(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION : SEQ ID NO:207: 
GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 
(2) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 31 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208: 



31 
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CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 
(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 209: 
GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 
(2) INFORMATION FOR SEQ ID NO: 210: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs 

(B) TYPE : nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210: 
GGATATCTGC AGAATTCAGG TTTAAAGCCC ATTTGCGA 
(2) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 
CCGCATGCGA GCCACGTGCC CACAACGGCC 
(2) INFORMATION FOR SEQ ID NO: 212:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:
CTTCATGGAA TTCTCAGGCC GGTAAGGTCC GCTGCGG
(2) INFORMATION FOR SEQ ID NO: 213:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7675 base pairs

(B) TYPE: nucleic acid
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(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:

»»«. gacgcgccct gtaocggcgc attaagcgcg 

CAGCCTGACC GCTACACTTC CCAGCGCCCT AGCGCCCGCT ^ 
C^CC ACCTTCCO* TCAAGCTCTA AATCGGGGGC TCCCTTTAGG 

'Cccaaaaaa «.« 

ACGTAGTGGG CCATCGCCCT GATAGACGGT 
CTTTAATAGT GGAC^TGT JiCMcAm 

Tm-™ taagggattt roccmimc 

ACAAAA,^ AACGCGAATT TTAACAAflAT ATTAACGTTT ACAATTTCAG G^GCA^T 
TCGGGGAAA T GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATCTA 
TCCGCTCATG AATTAa™ TAGAAAAACT CATCGAGCAr CAAATCAAAC TGCAATTTAT 
TCATATCAGG ATTATCAATA CCATATTTTT GAAAAAGCCG TTTCTGTAAT OAAGGAGAAA 
ACTCACCGAG GCAG^CA. AGGATGGCAA GATCCTGGTA TCGGTCTGCG ATTCCGACTC 
GTCCAACATC AATACAACCT ATTAATTTCC CCTCGTCAAA AATAAGGTTA TCAAGTGAGA 
AATCACCATG AGTGACGACT GAATCCGGTG AGAATGGCAA AAGTTTATGC ATTTC^- 
AGACTTGTTC AACAGGCCAG CCA^ACGCT CGTCATCAAA ATCACTC3CA TCAACCAAAC 

cgttattcat 3CCTMGCM GcMicGm 

AATTACAAAC AGGAATCGAA XGCAACGGGC GCAGGAACAC TGCCAGCGCA TCAACAATAT 
— ACC™ ATCAGGATAT TOTTO a T a CCTX ^ c ^ 
TGGTGAGTAA CCATGCATCA TCAGGAGTAC GGATAAAATG CTTGATGGTC GGAAGAGGCA 
TAAATTCCST CAGCCAGTTT AGTCTGACCA TCTCATCTGT AACATCATTG GCAACGCTAC 
CTTTGCCATG ^CAGAAAC AACTCTGGCG CATCGGGCTT CCCATACAAT CGATAGATTG 
TCGCACCTGA TTGCCCGACA TTATCGCGAG CCCATTTATA CCCATATAAA TCAGCATCCA 
TGTTGGAATT TAATCGCGGC CTAGAGCAAG ACGTTTCCCG TTI3AATATGG GTCATAACAC 
CC^AXT AC^A.G TAAGCAGACA GTTTTATTGT TCATGACCAA AATCCCTTAA 
CGTGAGTTTT CG^CACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA 
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GATCCTTTTT TTCTGCGCGT ARTCTrrmn ^ 

MTCT<3<:TC c tocaaacaa aaaaaccacc gctaccagcg 

STSGTTTOTr TGCCGGATCA Asat-rr.™.. 

"»=CTACCJ ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC 

*»gcgcaga ^caaatac isicctot raiMcror 

"™ " CC0CC ™ ™— GTGCTAATCC TGTTACCAGT GGCTGC^c 
AGTGGCGATA AGXCG^ 

°— ™GC ACACAGCCCA GC^ AACGACCTAC 
-SMOGA — .GAGAA^ 

AAGGCGGACA GG^CCGG, AAGCGGCAGG G.GGAACAG 
CCAGGGGGAA ACGCC^A «„. CCOTC0OT 

«n»m, tgtgatgctc g^agggggg awMeM 
*™ c ™^ c^ggtgg raroT?cTT ictocotj 

™„ c^aa cmmKQ xcm ^ maam! 

-CGAACGA CCGAGCGCAG GGAG^G AGCGAGGAAC CGGAAGAGCG CCXOA^ 
«™« TTACGCATCT GTGCGG^ TCACACCGCA „G C ACTCTCAGTA 
TGATCCCGCA TAGTTAAGCC AGTATACACT CCGC^COG TACGTGACTG 
GGTCATGGCT GCGCCCCGAC ACcrw-,.,. . 

SC °" JI « »»CHaC GCGCCCTGAC GGGCTTGTCT 
GCTCCCGGCA .CGC^ACA GACAAGCTGT GACC^CC GGGAGCTGCA „AG 
<™=3 TCATCACCGA AACGCGCGAG GCAGCTGCGG TAAAGCTCAT CAGCGTGGTC 
GTGAAGCGAT TCACAGATGT CTGCCTGTTC AXCCGCGTCC AGCTCGTTGA GTTTCTCCAG 
AAGCGTTAAT GTC^C TGATAAAGCG GGCCATGTTA AGGGCGGTTT rr^^r 

«™ „ TwcMrou 

ACGAGAGAGG ATGCTCACGA .CGGG^C CATGCCCGGT TACTGGAACG 

™~ AAACAACTGG CGGTATGGAT GCGGCGGGAC CAGAGAAAAA TCACTCAGGG 
TCAATGCCAG CGC^A ATACAGATGT CAGGGTAGCC AGCAGCATCC 

TGCGATGCAG ATCCGGAACA TAATGGTGCA GGOCGCXGAC TTCCGCGTTT CCAGAC^A 

CGAAACACGG AAACCGAAGA CCA^ 

GCAGTCGCTT CACGTTCGCT CGCGTATCGC T r»^, 

CGTATCGG TGATTCATTC TGCTAACCAG TAAGGCAACC 
CCGCCAGCCT AGCCGGGTCC tcaacip.^ „ 

-CAAC^CAG GAGCACGATC ATGCGCACCC GTGGGGCCGC 



1560 
CATGCCGGCG ATAATGGCCT GCTTCTCGCC GAAACGTTTG GTGGCGGGAC CAGTGACGAA
GGCTTGAGCG AGGGCGTGCA AGATTCCGAA TACCGCAAGC GACAGGCCGA TCATCGTCGC
GCTCCAGCGA AAGCGGTCCT CGCCGAAAAT GACCCAGAGC GCTGCCGGCA CCTGTCCTAC
GAGTTGCATG ATAAAGAAGA CAGTCATAAG TGCGGCGACG ATAGTCATGC CCCGCGCCCA
CCGGAAGGAG CTGACTGGGT TGAAGGCTCT CAAGGGCATC GGTCGAGATC CCGGTGCCTA
ATGAGTGAGC TAACTTACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA
CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG GTTTGCGTAT
TGGGCGCCAG GGTGGTTTTT CTTTTCACCA GTGAGACGGG CAACAGCTGA TTGCCCTTCA
CCGCCTGGCC CTGAGAGAGT TGCAGCAAGC GGTCCACGCT GGTTTGCCCC AGCAGGCGAA
AATCCTGTTT GATGGTGGTT AACGGCGGGA TATAACATGA GCTGTCTTCG GTATCGTCGT
ATCCCACTAC CGAGATATCC GCACCAACGC GCAGCCCGGA CTCGGTAATG GCGCGCATTG
CGCCCAGCGC CATCTGATCG TTGGCAACCA GCATCGCAGT GGGAACGATG CGCTCATTCA
GCATTTGCAT GGTTTGTTGA AAACCGGACA TGGCACTCCA GTCGCCTTCC CGTTCCGCTA
TCGGCTGAAT TTGATTGCGA GTGAGATATT TATGCCAGCC AGCCAGACGC AGACGCGCCG
AGACAGAACT TAATGGGCCC GCTAACAGCG CGATTTGCTG GTGACCCAAT GCGACCAGAT
GCTCCACGCC CAGTCGCGTA CCGTCTTCAT GGGAGAAAAT AATACTGTTG ATGGGTCTCT
GGTCAGAGAC ATCAAGAAAT AACGCCGGAA CATTAGTGCA GGCAGCTTCC ACAGCAATGG
CATCCTGGTC ATCCAGCGGA TAGTTAATGA TCAGCCCACT GACGCGTTGC GCGAGAAGAT
TGTGCACCGC CGCTTTACAG GCTTCGACGC CGCTTCGTTC TACCATCGAC ACCACCACGC
TGGCACCCAG TTGATCSGCG CGAGATTTAA TCGCCGCGAC AATTTGCGAC GGCGCGTGCA
GGGCCAGACT GGAGGTGGCA ACGCCAATCA GCAACGACTG TTTGCCCGCC AGTTGTTGTG
CCACGCGGTT GGGAATGTAA TTCAGCTCGG CCATCGCCGC TTCCACTTTT TCCCGCGTTT
TCGCAGAAAC GTGGCTGGCC TGGTTCACCA CGCGGGAAAC GGTCTGATAA GAGACACCGG
CATACTCTGC GACATCGTAT AACGTTACTG GTTTCACATT CACCACCCTG AATTGACTCT
CTTCCGGGCG CTATCATGCC ATACCGCGAA AGGTTTTGCG CCATTCGATG GTGTCCGGGA
TCTCGACGCT CTCCTCTTATG CGACTCCTGC ATTAGGAAGC AGCCCAGTAG TAGGTTGAGG
CCGTTGAGCA CCGCCGCCGC AAGGAATGGT GCATGCAAGG AGATGGCGCC CAACAGTCCC
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aKaaaae CCCGAAGTGG 
««» CTTCCCCATC GG^ 

CCGCCGG^A TGCCGGCCAC GAX^G GCGTAGAGGA TCGAGATCTC GATCCCGCGA 

«™«agg «x» ggggaa™ „ rM 

^ CAGATATACA M TGGG=CAT ^ ATc>OTfflT 

CGACATCATC GGGACCAGCC CCACATCCTG orim~ 

TCCTC ^CAGGCG GCGGCGGAGG CGGTCCAGCG 
«»« ACCG T CGA K ACA7CCGGG, ATTGAGCAGG AO„ 

««« GGCAAGATGA COTCCK „ 

—GCG AGGGGCTCGA AACCACCGAG CGG^GCC, GAAACGGGCG CCGGCGCCGG 
-™ ACTACCCCCG CGTCGTCGCC GGTGACGTTG GCGGAGACCG GTAGCACGC" 
PCTCTACCCG C^CAACC „< c 

GATCACCGCT CAGGGCACCG G^GC** GGGGA.CGCG CAGGCCGCCG CCGGCACGO, 
-CA^ GGC.CCGACG CGTAXCXGXG GGAAGGTGAT ATCCCCGCGC AC*^ 
GATGAACATC GCGCTAGCCA XCCCGCGA GCAGGTCAAC TACAACCTGC CCGGAGTGAG 
CGAGCACCTC AAGCTGAACG GAAAAGTCCT GGCGGCCA TO TACCAGGGCA CCATCAAAAC 
-GGACGAC CCGCAGATCG GTGCGGTCAA CGCCGGC™ AACCTGCCCG GCACCGCGGT 

CACCGCCGG ACGGGTCCGG ^CACC^ 
-GGAAGA, CGCGAGGGC GGGGCAAGTC CGCCCG™ ggGAGCACGG TCGACTTCCC 
-CGGTGCCG GXGAGAAGGG CAACGGCGGC ATGGTGACCG G^CGCGg. 

~cac=g=gc .gcgtggcc ATATCGGCAT cag CTCOT GACCAGGGCA GTCAACGGGG 

ACTCGGCGAG GCCCAACTAG GCAATAGCTC TGrra™ 

UAATAGCTC TGGCAATTTC TTGTTGCCCG ACGCGCAAAG 

GATTCAGGCC GCGGCGGCTG GG^ GAAAACCCGG GCGAACCAGG CGATTTCGAT 
QATCGACGGG CCCGGCCCGG AGGGG.ACCG GATCATCAAC TACGAGTACG CCA^ 
GAACCGGCAA AAGGACGCCG CCACCGCGCA GACCTTGCAG GGA^CXGC ACTGGGCGAT 

CACCGACGGC AACAAGGCCT CGTTCCTCGA rra™, 

™CCTCGA _CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC 
GGTGGTGAAG ^ ^ 

-™ CTCGCGCAGG AGGGAGG.AA ^gACGCG ATCTCCGGCG AGCGAAAAC 
GCAGA.CGAG CAGGTGGAGT CGACGGCAGG ^GCAG GGCCAGTGGC GCGGCGCCGC
gggcacgccc gcccacgccg ccg^ cras

—CGAC GAGA^CGA TOCC GTCCA1I1CI 0>TO 

CGAGGAGCAG CAGCAGGCGC TGTCCTCRra 

TGTCCTCGCA AATGGGCTTT GTGCCCACAA CGGCCGCCTC 

<=C™ ACCGCGCAG CGcacOTc ACCGGCGACA CCXC^CCC CCCCCC.CC 

cccccccccc atgcccagcc mcgcmcjc 

CTO,iC « ™» — gcaccc=»c c^ccgga* 

-CAACCCG G^CAGGA* TCAGCTTCGC GCTGCCTGCT COC^ .CCTOACCC 

cgccca^c gacaccg^ ^ accs(k=acc 
cggacacccc ccgccggxgg ccaatgacac cc„ «ggccgcc 

«™«= AGCGCCGAAG CCACCSACXC CAAGGCCGCG GCCCGC^C CCCGGACA- 
GGGTGAGTTC TATATGCCCT ACCCGGGCAC CCGGAXCAAC CAGGAAACCG ^GC^A 
CGCCAACGGG GTGTCTGGAA GCGCGTCGTA TTACGAAGTC AAGTTCAGCG ATCCGAGTAA 
GCCGAACGGC CAGATCTGGA CGGGCGTAAT CGGCTCGCCC GCGGCGAACG CACCGGACGC 
C-CCCCCX CAGCGCTGGT GCCCGCACC CCCAACAACC CCGTCGACAA 

-CGCGGCC AAGGCGCTGG CCGAATCGAT CCCCC^ GTCGCCCCGC CGCCGGCGCC 
-CACCGGC CC^CACAGC CCGCTCCGGC GCCGGC3CCG GCCGGGGAAG .CGCTCCAC 
CCCGACGACA CCGACACCGC AGCGGAC CT ACCGGCCGA GAAT^CA GATATCCATC 

™gg ccgccgagc accaccacca ccaccactga gatccggctg caacaaagc 

CCGAAAGGAA GCTGAGTTGG CTGC~rr~ar 

CTGC - GC - A C CGCTGAGCAA TAACTAGCAT AACCCCTTGG 

GGCCTCTAAA CGGGTCTTGA GGGGTTTTTT GC-rm^, 

i.TTTT GCTGAAAGGA GGAACTATAT CCGGAT 

(2) INFORMATION FOR SEQ ID NO: 214:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 802 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:214:
Mee Gly His His His His His Hi, v a1 T i 

, s Hxs H 1S Val u e He n e Gly Thr Ser 

10 15 

P« Thr ser Trp Glu Gin Ala Ala Ala Glu Ala v.i Gin a 

A±a Va. Gin Arg Ala Arg
20 25

30 



** S " » " P - ~ «• - ~ n. «- «. 

40 45 

T ** S " "* 2~ «• Thr ^ ^ Ile Lys Leu =lu 

b5 60 

Jjr Ph* Lys „« ..jj,^ Q1 „ pro ^ My ^ 

75 

«T ~ Pro cl u j, 31y Ala « y a . Gly r**!***,^ ,1 



«. ~r *.r Pro « T*r M . 01 „ Jhl 01y set ^ ^ ^ 

i -t n 



110 



^ _ ^ Trp ar ^ ^ 

"2° 125 
V! * ,U T*r U , ^ Thr cly s=r My ^ y iu 

^ 140 

JJ. Ala Al, „ y IBr _ Ue =ly M> ^ ^ ^ 

155 

i« «v ^ Met ^ ^ His Lys Gly Leu ^ ne ^ l ^ ™ 



165 



He Ser Ala Gin Gin Val Asn Tyr Asn Leu Pro 



180 71 ASn Leu Pro G1 y Va * Ser Glu His 



185 190 



«-» ^ U« Asn Giy Lys val Ala ^ ^ ^ g ^ t ^ 

200 205 



Lys Thr Trp Asp Asd d-o Gin n a ,i 

210 P ^° Ile ^ a ^* Asn Pro Gly Val Asa 

220 

J- >ro « y ^ . u , V11 p „ ^ ^ ^ ^ ^ 

-P Thr P„ e P „ TSt „ ^ L . u ^ up pm ™ 

250 255 
Trp «, ly s £, Pro 01. Ph . Thr „ VI1 ^ phe ^ ^ 

265 270 

Pro S l y ^ ^ Gly Clu ^ ^ ^ ^ ^ ^ ^ 

,280 285 

Clu fc Pro „ y cys « All ^ ne sly n . ^ 

^ 5 300 

«J «• S .r ax. oi y ^ u aly slu „, „ ^ siy 



315 320
«r ». jj. Pro „ Ma Gln ^ iie ^ ^ ^ ^ ^

«r - S ^ ^ Pro ua ^ U1 tle ser fct 

345 350 

" pr ° s p ™ * s "* lle to ** «« ^ «• «. 

~ ^ a, p ua ua fc ua ^ ^ ^ ola ua 

-J - H is rr, ». Tto ^ =ly ^ ^ 2" s « pta ^ ^ 

395 

3 400 

- ~ ». P>. ^ ?ro Leu pr0 ^ ^ w ^ ^ ^ ^ 

410 415 

- L- U. „, „ n , s „ ^ ^ Mu ^ ^ ^ ^ ^ 

* ^ jo. 01n 01 „ 51y phe giu ^ iu ^ ^ „ leu 

445 

Lys Thr Gin ne ^ Gln 

4S0 ; 11111 Ser Thr Ala Gly Ser Leu Gin Gly 

" 460 

«J T, *» 01y Ala AU Sly T6r u , M , Ua ua ^ 

475 

3 480 

«- «. «. «. jj. ^ ^ „ L/s „ slu Leu ^ aiu ue ser 

% *" " e S «* ~ £ - Arg Ala Asp »„ 01 „ 

05 510 
«• «. «J Ala ta te s „ ph . ^ ^ ^ 

^ 525 
Ala Ser Pro Pro Se*- T-hr ai a a1 „, 

530 ^ JJ* Wa W Pro Ala Pro Ala Thr Pro 

3 540 
Val Ala Pro Pro Pro Pro Ala ju »! . 

5« 550 ^ a ASn Thr Pro Gin Pro 

555 

«y -p P „ ^ „, Ua Pro ^ pro ^ ^ ^ ^ ^ 2 

*. - val XI. Ala to „, J p „ ^ ue ^ 4sn 

58S 590 
Pro Val Gly Gly Phe Ser Phe m, t« „ 

5 95 f Q a Leu Ala Gly Trp Val Glu Ser 

605
- *. «. ». „ s Gly s „ ui Leu ^ s<r ^ ^

620 

s », *. p.. « Gly Gln pro Pto ,„ v>i ^ ^ ^ ^ 

635 640 
- v., u. ^ „ L . u „ G1 „ ^ ^ ^ ^ ^ ^ ^ 

^ 185 S ^ "* "* S «~ «r - ^ «= 1" «. 

S65 670 
«- «* H.t ,„ ^ ^ Sly ^ ue ^ ^ ^ ^ 

680 6B5 

- JJP «. ^ ^ vu s Gly s „ Ma _ ^ ^ ^ 

700 

£ 3« „ P „ s . r pr<> ^ My ne ^ ^ 

715 

«y ~ Pro wa £ ta u , p „ ^ ^ ^ ^ ^ ™ 

- - « £ U. »„ ^ Ma ^ ^ ^ ^ gj y 

£ *- M ° - "« •» I. « ^ Z ,ro 

760 765 

«. «j «. PC0 ua 5ro Glu pro Ma pro ^ ^ ^ ^ ^ 

civ «. v. ^ ? „ ^ ? „ Thc ^ _ ^ ™ ^ ^ ^ ^ 



800 

Pro Ala 

(2) INFORMATION FOR SEQ ID NO: 215: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 454 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:215:

tcgacStga SSSS SSSS S^ TCC GTTCGCGAAC «™*»cgg 60 

GCGGTCCGCG CGGTGACGAA GCTGaScS TCAGGAAGCC GTCCAGCAGC 120 

CCGACCAATG TCGACCGGCT GATCCGCcS IcS^rnl, GCAGCACCCC GGCGATGGCG 180 
.CACCCAGCA GGGCGCCGGT GAACCGcS ££££ 2£££ >« 



300
GAAGATGGGG GTGCCGGCAT CcSSccS ^ ^ GCTTCTCGTC J"

(2) INFORMATION FOR SEQ ID NO: 216:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 470 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:

TGCAGAAGTA

GCAtStS ScSgSq SS^Stg JS CGAACG GTCGCC3AAC 
ATACCACCGA CGACCTGCTG St1SS£c iSS*** CGTCGTC ^ GCCATGGGGG 
AGCTGGACAT GCTGCTTACC GCcSS^r ^ CAGGTGTG CCCGGCGCCG CCGCCTCGGG^ 
TCGAGTCGCT CGGCGCGCAT ScSS^ TGCGTTGGTG GCCATGgS 

ssss ss2S£ si : 2ss 2ss sssss ;s 

atgtcacgac gttgggccgc S5SS 2S«S SSSSS ™ S 420 

(2) INFORMATION FOR SEQ ID NO: 217:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 279 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217:



— oc sees sss? 

(2) INFORMATION FOR SEQ ID NO: 218: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 219 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:



60 
120 
180 
240 
300 



GGCCGGCGTA CCCGGC^GGG Aran a 

TCGTGGCCCT TCCCCAGTTG AcSISiS ATCGATTGA T ATCGATGAGA GACGGAGGAA 

CCGCACGTCG AGCGCGaS gSSc^S iE?*"** CGCGTT ^ AAGgSSJ x £ 

CCCAGGTCCT CAAGGACGCG iSS"* ^GTGGCGGC ACCAACCtS ^ 

tgcttgaggc cttgccaaag g^SJS eioxcncoc 



279
ACACGGTCGA ACTCGACGAG CCCCTCGTGG AGGTGTCGAC CGACAAGGTC GACACCGAAA
TCCCTCGCCG GCCGCGGGTG TGCTGACCAA GATCATCGCC CAAGAAGATG ACACGGTCGA 
GGTCGGCGGC GAGCTCTCTG TCATTGGCGA CGCCCATGAT GCCGGCGAGG CCGCGGTCCC 
GGCACCCCAG AAAGTCTCTG CCGGCCCAAC CCGAATCCA 

(2) INFORMATION FOR SEQ ID NO: 219: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 342 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219: 



TCGCTGCCGA CATCGGCGCC GCGCCCGCCC CCAAGCCCGC ACCCAAGCCC GTCCCCGAGC 
CAGCGCCGAC GCCGAAGGCC GAACCCGCAC CATCGCCGCC GGCGGCCCAG CCAGCCGGTG 
CGGCCGAGGG CGCACCGTAC GTGACGCCGC TGGTGCGAAA GCTGGCGTCG GAAAACAACA 
TCGACCTCGC CGGGGTGACC GGCACCGGAG TGGGTGGTCG CATCCGCAAA CAGGATGTGC 
TGGCCGCGGC TGAACAAAAG AAGCGGGCGA AAGCACCGGC GCCGGCCGCC CAGGCCGCCG 
CCGCGCCGGC CCCGAAAGCG CCGCCTGAAG ATCCGATGCC GC 

(2) INFORMATION FOR SEQ ID NO: 220: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 515 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:220: 



GGGTCTTGGT CAGTATCAGC GCCGACGAGG ACGCCACGGT GCCCGTCGGC GGCGAGTTGG 60 

CCCGGATCGG TGTCGCTGCC GACATCGGCG CCGCGCCCGC CCCCAAGCCC GCACCCAAGC 120 

CCGTCCCCGA GCCAGCGCCG ACGCCGAAGG CCGAACCCGC ACCATCGCCG CCGGCGGCCC 180 

AGCCAGCCGG TGCGGCCGAG GGCGCACCGT ACGTGACGCC GCTGGTGCGA AAGCTGGCGT 240 

CGGAAAACAA CATCGACCTC GCCGGGGTGA CCGGCACCGG AGTGGGTGGT CGCATCCGCA 300

AACAGGATGT GCTGGCCGCG GCTGAACAAA AGAAGCGGGC GAAAGCACCG GCGCCCTGAG 360

CGCTTCATCA CCCGGTTAAC CAGCTTGCCC CAGAAGCCGG CTTCGACCTC TTCGCGGGTC 420

TTGGTCCGCT GCAGGCGGTC GGCGAGCCAG TTCAGGTTAG GCGGCCGAAA TCTTCCAGTT 480

CGCCAGGAAG GGCACCCGGA ACAGGGTCCG CACCC 515 



(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 557 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:

CCGACCCCAA GGTGCAGATT CAACAGGCCA TTGAGGAAGC ACAGCGCACC CACCAAGCGC 
TGACTCAACA GGCGGCGCAA GTGATCGGTA ACCAGCGTCA aSSStG SSSS 
GACAGCTGGC GGACATCGAA AAGCTTCAGG TCAATGTGCG CCAAGCCCTG aSctS^CG 
r^rr^r ^CGCTGCCA AGGCCACCGA ATACaSSc SSSSS 

^ CAGCTGGTG ACCGCCGA <* AGAGCGTCGA AGACCTCAAG ACGcSSSS 
SSSS GCTCAGGCCA AGAAGGCCGT CGAACGAAAT GCGATGGTGC 

Irj£S2£ GATCGCCGAG CGAACCAAGC TGCTCAGCCA GCTCGAGCAG GCGAAGATGC 
AGGAGCAGGT CAGCGCATCG TTGCGGTCGA TGAGTGAGCT CGCCGCGCCA GGCAACACGC 

SSSS SET 6 Qmam ™« 

(2) INFORMATION FOR SEQ ID NO: 222: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 223 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:222: 

CAGGATAGGT TTCGACATCC ACCTGGGTTC CGCACCCGGT GCGCGACCGT GTGATAGGCC 

^Sc CGACGATCGA — — C AACAGAAATG gSSSS 

™ GGCACTCGGT GAGAGCGTCA CCGAGGGGAC GGTTACCCGC TGGCTCAAAC 
AGGAAGGCGA CACGGTCGAA CTCGACGAGC CCCTCGTGGA GGT 

(2) INFORMATION FOR SEQ ID NO: 223:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 578 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:223:

IJSJSJS TCTGCCGGTC GATGTCGGCG AACCACGGCA GCCAACCGGC GCAGTAGCCG 
ISSJS? ccgcataacg CCAGTCCCGG CGCACAAACA TACGCCACCC CGCGTATGCC 
rTrrTr^ CCGCCAG ^ CCACATCGCG GGCGTGCCGA CCAGCATCTC GGCCTTGACG 
CGCCGCAGCC TGCAACGTCT TGCTGGTCGA TGGCGTACAG CAcSSgC 
SiSS? GCCAGGTCCA CGGm ^T TCCCAAGGGT GGTAGTTGCC TGCGGAATTC 
?C^SS ™^f G GAACGCmG GCGGTGTATT GCCAGAGCGA GCGCACGGCG 
I^SnS GAACAACCGA GTTGCGACCG ACCGCTTGAC CGACCGCATG CCGATCGATC 
SrSSSS ACGCGAACCA CGGAGCGTAG GTGGCCAGAT AGACCGCGAA CGGGATCAAC 
S^Sg CGCCGCACTG TTCCCAGCCA CGGTCTTTGC 

ACTTGGTATG AACGTCGCGC CGCCACGTCA ACGCCAGC 

(2) INFORMATION FOR SEQ ID NO: 224:



60
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 484 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



60 
120 
180 
240 
300 
360 
420 
480
(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:

~ T CO ?~ G G ~£ SCCCCTTCCC CACTTGACCG 

TCAAGGATCG GCTCAAgS ESS?** ACGTCGAGCG CGAGCAGAGC 

GCGATGAAGT CTTGGGCAAA ££2£S SSS?* ^CGCGGAGA 
GCAAGGTCAA GGCGCAGGAG SSrlScCG £££5°" TGAGGCOTG CCAAAGGTGG 
TCGTGGCCTC GGTGACCGTC ScSgS rr£^ T TGCGCCC ^ CCCGCCGCCT 
CCGCCGGCCG ACGATGCGGG SSSgIc SSSSJ AA ° TTOWCT CCGC ^C 
GAAGCGGCCT GACAGGGCCA GCTo£S£ GTACCCCCGC ATACGGGGGA 

GCCC CiCTCACAATT CAGGCCGAAC GCCCCGGTGG GGGGGAACCC 

(2) INFORMATION FOR SEQ ID NO:225: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 537 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225:

SSSSS SSS CCAGCATCTC ™- 

AACGACATGG GCCAGGTCCA ^SJrS SSSf^ TGGCGTACAG CACCGGCCGC 
GTCAGGCCCG CGTGGAAgS gScISSg GGTAGTTGCC TGCGGAATTC 

TCGGGCAGCG GAACAACCGA GTrrrrl^^ SCGGTGTAGT GCCAGAGCGA GCGCACGGCG 
GCGGTCTCGG ACG^cS SaGcS iSS?™* CGACCGCAT = CCGATCGATC ... 

s^skS 2 ^ 5 ^^^ 2.° 

ACGAAGTACA CGCCgSS SSSS SEE SEE EE?" 

(2) INFORMATION FOR SEQ ID NO: 226:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 135 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:226:
Gly Gly Ala Ala Ala Gly Gln Gln Ser Asp Val His Pro Phe Ala Asn
Leu Ile Ala Val Asp Asp Glu Arg Ala Glu Arg Arg Asp Asp Glu Glu
Arg Gln Glu Ala Val Gln Gln Arg Gly Pro Arg Gly Asp Glu Ala Asp
Pro Val Ala Asp Gln Gln His Pro Gly Asp Gly Ala Asp Gln Cys Arg

55 £Q 

Pro Ala Asp Pro Pro His Asp Pro His His Gln Arg His Gln Asp His
Thr Gln Gln Gly Ala Gly Glu Pro Pro Ala gL Ser Val Val Thr IL



95 



Asp Gly Leu Pro Asp Arg Asp Gln Leu Leu Thr Asp Arg Arg Val Asn

105 110 
His Gln Ala Val Pro Gly Val Val Phe His Pro Met Val Val Gln His

120 125 

Leu Pro Gly Leu Ala Val Arg 
130 135 



(2) INFORMATION FOR SEQ ID NO: 227: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 156 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:
Gln Lys Tyr Gly Gly Ser Ser Val Ala Asp Ala Glu Arg Ile Arg Arg
Val Ala Glu Arg Ile Val Ala Thr Lys Lys Gln Gly Asn Asp Hi Val
Val Val Val Ser Ala Met Gly Asp Thr Thr Asp Asp Leu Leu Asp Leu
Ala Gln Gln Val Cys Pro Ala Pro Pro Pro Arg Glu Leu Asp Met Leu

55 gQ 

Leu Thr Ala Gly Glu Arg Ile Ser ^ Ala Leu Val ^ ^ ^ ^
Glu Ser Leu Gly Ala His Ala Arg Ser Phe Thr Gly Ser Gln Ala Gly
Val Ile Thr Thr Gly Thr His Gly ^ n ^ a Lys Ile m Thr

105 no 
Pro Gly Arg Leu Gln Thr Ala Leu Glu Glu Gly Arg Val Val Leu Val

120

Ala Gly Phe Gln Gly Val Ser Gln Asp Thr Lys Asp Val Thr Thr Leu

Gly Arg Gly Gly Ser Asp Thr Thr Ala Val Ala Met

145 - - - 



150 155 



(2) INFORMATION FOR SEQ ID NO: 228: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 92 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:228:

Pro Ala Tyr Pro Ala Gly Thr Asn Asn Asp Arg Leu Ile Ser Met Arg

Asp Gly Gly Ile Val Ala Leu Pro Gln Leu Thr Asp Glu Gln Arg Ala

Ala Ala Leu Glu Lys Ala Ala Ala Ala Arg Arg Ala Arg Glu Leu

Lys Asp Arg Leu Lys Arg Gly Gly Thr Asn Leu Thr Gln Val Leu Lys

Asp Ala Glu Ser Asp Glu Val Leu Gly Lys Met Lys Val Ser Ala Leu
70 75 



Leu Glu Ala Leu Pro Lys Val Gly Lys Val Gln Ala



80 



85 90 



(2) INFORMATION FOR SEQ ID NO: 229:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 72 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:229:
Thr Val Glu Leu Asp Glu Pro Leu Val Glu Val Ser Thr Asp Lys Val
Asp Thr Glu Ile Pro Ser Pro Ala Ala Gly Val Leu Lys Ile Ile

Ala Gln Glu Asp Asp Thr Val Glu Val Gly Gly Glu Leu Ser Val Ile
Gly Asp Ala His Asp Ala Gly Glu Ala Ala Val Pro Ala Pro Gln Lys

Val Ser Ala Gly Pro Thr Arg Ile 60
65 70



(2) INFORMATION FOR SEQ ID NO:230: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:
Ala Ala Asp Ile Gly Ala Ala Pro Ala Pro Lys Pro Ala Pro Lys Pro

Val Pro Glu Pro Ala Pro Thr Pro Lys Ala Glu Pro Ala Pro Ser Pro
Pro Ala Ala Gln Pro Ala Gly Ala Ala Glu Gly Ala Pro Tyr Val Thr
Pro Leu Val Arg Lys Leu Ala Ser Glu Asn Asn Ile Asp Leu Ala Gly
Val Thr Gly Thr Gly Val Gly Gly Arg Ile Arg Lys Gln Asp Val Leu

Ala Ala Ala Glu Gln Lys Lys Arg Ala Lys Ala Pro Ala Pro Ala Ala

90

Gln Ala Ala Ala Ala Pro Ala Pro Lys Ala Pro Pro Glu Asp Pro Met
Pro 105 110



(2) INFORMATION FOR SEQ ID NO: 231: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 118 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:
Val Leu Val Ser Ile Ser Ala Asp Glu Asp Ala Thr Val Pro Val Gly
Gly Glu Leu Ala Arg Ile Gly Val Ala Ala Asp Ile Gly Ala Ala Pro
Ala Pro Lys Pro Ala Pro Lys Pro Val Pro Glu Pro Ala Pro Thr Pro
Lys Ala Glu Pro Ala Pro Ser Pro Pro Ala Ala Gln Pro Ala Gly Ala
Ala Glu Gly Ala Pro Tyr Val Thr Pro Leu Val Arg Lys Leu Ala Ser
Glu Asn Asn Ile Asp Leu Ala Gly Val Thr Gly Thr Gly Val Gly Gly
Arg Ile Arg Lys Gln Asp Val Leu Ala Ala Ala Glu Gln Lys Lys Arg

Ala Lys Ala Pro Ala Pro 110
115



(2) INFORMATION FOR SEQ ID NO: 232: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 185 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:232:
Asp Pro Lys Val Gln Ile Gln Gln Ala Ile Glu Glu Ala Gln Arg Thr

10 15

His Gln Ala Leu Thr Gln Gln Ala Ala Gln Val Ile Gly Asn Gln Arg

Gln Leu Glu Met Arg Leu Asn Arg Gln Leu Ala Asp Ile Glu Lys Leu

40 45 
Gln Val Asn Val Arg Gln Ala Leu Thr Leu Ala Asp Gln Ala Thr Ala

Ala Gly Asp Ala Ala Lys Ala Thr Glu Tyr Asn "n Ala Ala Glu Ala

Phe Ala Ala Gln Leu Val Thr Ala Glu Gln Ser Val Glu Asp Leu Lys

Thr Leu His Asp Gln Ala Leu Ser Ala Ala Ala Gln Ala Lys Lys Ala

105

Val Glu Arg Asn Ala Met Val Leu Gln Gln Lys Ile Ala Glu Arg Thr

120 125

Lys Leu Leu Ser Gln Leu Glu Gln Ala Lys Met Gln Glu Gln Val Ser

Ala Ser Leu Arg Ser Met Ser Glu Leu Ala Ala Pro Gly Asn Thr Pro

Ser Leu Asp Glu Val Arg Asp Lys Ile Glu S Arg Tyr Ala Asn

_ -i _ 165 170 

Ile Gly Ser Ala Glu Leu Ala Glu Ser



175 



180 185 
(2) INFORMATION FOR SEQ ID NO: 233:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 71 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:233:



Val 
1 


Ser 


Thr 


Ser 


Thr 
5 


Trp val Pro His Pro Val 


Gly 


Gin 


Arg 


Trp 


Thr 


10 

Cys Ala Asp Arg Arg Ser 


Glu 






20 




25 


Met 


Ala 


Phe 


Ser 


Val Gin Met Pro Ala Leu 


Glu 




35 






40 


Gly Thr 


Val 


Thr 


Arg Trp Leu Lys Gin Glu 




SO 








55 


Leu 


Asp 


Glu 


Pro 


Leu 


Val Glu 


65 










70 



15 

He Glu Glu Sei 
30 

Gly Glu Ser Val 
45 

Gly Asp Thr Val 
60 



(2) INFORMATION FOR SEQ ID NO: 234:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 182 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:234:
Glu Val His Leu Pro Val Asp Val Gly Glu Pro Arg Gln Pro Thr Gly

Ala Val Ala Asp Gln Asp His Arg Ile Thr Pro Val Pro Ala His Lys

30

His Thr Pro Pro Arg Val Cys Gln Asp Trp His Arg Gln Pro Pro His
Arg Gly Arg Ala Asp Gln His Leu Gly Leu Asp Ala Arg Leu Cys Ala
Ala Ala Cys Asn Val Leu Leu Val Asp Gly Val Gln His Arg Pro Gln

Arg His Gly Pro Gly Pro Arg Phe Gly Phe Pro Arg Val Val Val Ala

85 90 95

Cys Gly Ile Arg Gln Ala Arg Val Glu Val Glu Arg Phe Gly Gly Val
100 105 



Leu Pro Glu Arg Ala His Gly Val Gly Gln Arg Asn Asn Arg Val Ala

120 125

Thr Asp Arg Leu Thr Asp Arg Met Pro Ile Asp Arg Gly Leu Gly Arg

135 140



Glu Pro Arg Ser Val Gly Gly Gln Ile Asp Arg Glu Arg Asp Gln Pro
Gln Arg Ile Pro Ala Gly Lys His Val Thr Pro His Cys Ser Gln Pro



175 



170 

Arg Ser Leu His Leu Val 
180 

(2) INFORMATION FOR SEQ ID NO: 235:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:235:

Asn Asp Arg Leu Ile Ser Met Arg Asp Gly Gly Ile Val Ala Leu Pro

Gln Leu Thr Asp Glu Gln Arg Ala Ala Ala Leu Glu Lys Ala Ala Ala

20 25
Ala Arg Arg Ala Arg Ala Glu Leu Lys Asp Arg Leu Lys Arg Gly Gly

Thr Asn Leu Thr Gln Val Leu Lys Asp Ala Glu Ser Asp Glu Val Leu

55 60
Gly Lys Met Lys Val Ser Ala Leu Leu Glu Ala Leu Pro Lys Val Gly
I*. Val Lys Ala Gln Glu Ile Met ^ Glu £ ^ ^ ^ ^ JO

90

Pro Ala Ala Phe Val Ala Ser Val Thr Val Ser Ala Arg Pro Cys Trp
Lys Ser Ser Ala Pro Pro Asn Pro 111 Gly Arg Arg Cys Gly Pro Glu
Gly Leu Trp Trp Ala Tyr Pro £ Ile Arg Gly Arg Ser Gly Leu Thr
Gly Pro Ala His Asn Ser Gly Arg Thr Pro Arg £ Gly Gly Thr ^

150 155 160

(2) INFORMATION FOR SEQ ID NO: 236: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 178 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:236:

Asp Trp His Arg Gln Pro Pro His Arg Gly Arg Ala Asp Gln His Leu

10

Gly Leu Asp Ala Arg Leu Cys Ala Ala Ala Cys Asn Val Leu Leu Val
Asp Gly Val Gln His Arg Pro Gln Arg His Gly Pro Gly Pro Arg Phe
Gly Phe Pro Arg Val Val Val Ala Cys Gly Ile Arg Gln Ala Arg Val
Glu Val Glu Arg Phe Gly Gly Val Val Pro Glu Arg Ala His Gly Val
Gly Gln Arg Asn Asn Arg Val Ala Thr Asp Arg Leu Thr Asp Arg Met
Pro Ile Asp Arg Gly Leu Gly Arg Glu Pro Arg Ser Val Gly Gly Gln
Ile Asp Arg Glu Arg Asp Gln Pro Gln Arg Ile Pro Ala Gly Lys His
Val Thr Pro His Cys Pro Gln Pro Arg Ser Leu His Leu Val Leu Thr
Arg Arg His Val Glu Arg Gln Arg His Arg Ala Glu Glu Gln His
Glu Val His Ala Gly Pro Leu Gly Gly Ala III Gln Ser Gln Ala Ala
Pro Arg 170 175 



(2) INFORMATION FOR SEQ ID NO: 237:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 271 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:237: 

ATGCCAAGCC GGTGCTGATG CCCGAGCTCG GCGAATCGGT GACCGAGGGG ACCGTCATTC 60 

GTTGGCTGAA GAAGATCGGG GATTCGGTTC AGGTTGACGA GCCACTCGTG GAGGTGTCCA 120

GGACACCGAG ATCCCGTCCC CGGTGGCTGG GGTCTTGGTC AGTATCAGCG 180 

CCGACGAGGA CGCCACGGTG CCCGTCGGCG GCGAGTTGGC CCGGATCGGT GTCGCTGCCG 240 
AGATCGGCGC CGCGCCCGCC CCCAAGCCCC C 

271 

(2) INFORMATION FOR SEQ ID NO: 238: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 89 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238:

Ala Lys Pro Val Leu Met Pro Glu Leu Gly Glu Ser Val Thr Glu Gly

1 5 10 15

Thr Val Ile Arg Trp Leu Lys Lys Ile Gly Asp Ser Val Gln Val Asp

20 25 30

Glu Pro Leu Val Glu Val Ser Thr Asp Lys Val Asp Thr Glu Ile Pro

35 40 45

Ser Pro Val Ala Gly Val Leu Val Ser Ile Ser Ala Asp Glu Asp Ala

50 55 60

Thr Val Pro Val Gly Gly Glu Leu Ala Arg Ile Gly Val Ala Ala Glu

65 70 75 80

Ile Gly Ala Ala Pro Ala Pro Lys Pro
85 



(2) INFORMATION FOR SEQ ID NO: 239:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:239: 

GAGGTAGCGG ATGGCCGGAG GAGCACCCCA GGACCGCGCC CGAACCGCGG GTGCCGGTCA 
TCGATATGTG GGCACCGTTC GTTCCGTCCG CCGAGGTCAT TGACGAT 

(2) INFORMATION FOR SEQ ID NO: 240:



60 
107
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 339 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:

JSSSSS SSS SSSS ^ TACTGG GGCGCTTGTG 
GGCTATTGCC CG^tSSg aISS ATCA6CCGGA CATGACGAAA 

TACCCCGACG GCTCGTTTTG SSSS f* 8 * 01 ™ CCGTGTGCGA CGGCGAGAAG 
TACTTCGATT GTGTCAGCGG S"*** 0 " GGTTTACCGG CCCACAGTTT 

GCTGGGGCAA TTCcSSS SccSgT 

(2) INFORMATION FOR SEQ ID NO:241: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 112 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:
Met Lys Leu Lys Phe Ala Arg Leu Ser Thr Ala Ile Leu Gly Cys Ala
Ala Ala Leu Val Phe Pro ^ Ser ^ £ ^ ^ ^ ^ £ ^

Pro His Gln Pro Asp Met Thr Lys Gly Tyr Cys Pro Gly Gly ^ Trp
Gly Phe Gly Asp Leu Ala Val Cys ^ Gly Glu Lys £ Pro ^ ^
-jr Trp His Gln Trp Met Gln fc Trp Phe £ Gly Pro Gln Phe

Tyr Phe Asp Cys Val Ser Gly Gly Glu Pro H Pro Gly Pro Pro

an 

- «y « r uy 01y ^ :le p „ s „ 01u ^ ^ £ ^ 

105 110 
(2) INFORMATION FOR SEQ ID NO: 242: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 371 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:242:
cSSS clSScc J?"™ «™«* ..

GTGGCAATGG CGGtSSS SSS^rn ^GGCMG GGCGGCGCCG 120 

CGGGCGCCCC CGGCGGCAAC ^CAGCGG CCCCGCCTCC ATCGGGGTCA 180 

CAGGTGGCGA CGGCGgS^ SSf™ CCCAACGGCT 240 
GCGCCAACAG CGGCATC^ GGGGTCCCGG CGGCAACGGG GGCTCGATCG 
GAAACGGCAG C GGCGGTTCCG GTGGGGCCGG TGGCGCTGGC GGCGCCGGCG 



(2) INFORMATION FOR SEQ ID NO: 243: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 424 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243:



SSSSS SEES SSSS J 6 ™ 0 "" ^cccttttc 

ATGGGCAACA Ttt£JE£ SSSS TGTTGGCATG ATCGTGACCC 

GGTGGTGAGC ATCGGTCTAG SSSSgC TCCATGCGAA TCGCCGCCGC 

CCCGTCGGAG CCCGGGGTTG ScSlcGC f CCGACGCAC * 

CGTCGGCGCC CCAATGGGGT GGGaSS^ AAGGGGTCGG TCGGCAACAT 

CGAACTACCG GCGTGCAaS CCGTTCCAGG CGTTTTGGGT 

CGAC ACTGGo.GGA catcgggctg CCCGAGGTGT ACGACGATCC 

(2) INFORMATION FOR SEQ ID NO: 244: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 317 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:

SSSSS S™"™" TTCCCAACCC aUBKCCr 

AGGGCGGATG CGATcS^ SSSS ^^^^ TCACCGAGGG CCACCGTCTA 
TGGGACTTGG TGGAgSS gSgcSS ^ACCGGCTGC CTTTCGCCGA GCCGCCGGAT 
GTCATCGCCG AoSSSS ^ CGTCACCG GCGACACGGT GCGCATCGAC 

CGGCTCTACG ATTCGTC TCCCGAACTG GCGGCGGCGT CCAAACTCAC CGAATCGCTG 

(2) INFORMATION FOR SEQ ID NO: 245:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 422 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single



300 
360 
371
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:245: 

TGGCGTATGC GCTTCGCAGC CGGTGCCGCG TCAACGCGCC GGAGGCAATC GCTTCGCTGC 60 

CGAGGAATGG TTCGATCACG ATCGCAGTGT GCCGTCGTGC ACCGACACCG CCGTCCAACG 120 

TGAACTGAGG GCGGAAAATC GGCCGAAATC TCGCCCTCAG TTCACGCTCG GCGCCTAACG 180 

GTTCTGGAAG TTGGGTGCGC GCTTCTCGGC GAACGCGCGC GGGCCTTCCT TGGCGTCGTC 240 

GGACAGGAAG ACCTTGATGC CGATCTGGGT GTCGATCTTG AACGCCTCGT TTTCGGGCAT 300 

GCACTCGGTC TCGCGGATGG ACCGCAAGAT GGCCTGCACG GCCAGGGGTC CGTTAGCCGA 360 

GATGGCGTCG GCAAGTTCTA GAACCTTGGT CAACGCCTGG CCGTCGGGCA CACGTGGCCG 420 

AT 422 

(2) INFORMATION FOR SEQ ID NO: 246: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 426 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE : cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246: 

GCGTGCCGCT GAACACCAGC CCGCGGCTGC CAGATCTCCC GGACTCGGTA GTGCCGCCGG 60 

TGGCGTCGTT GCTCTCCTGA CGGGGCGCGG CGACCATAAG GTCGCTAATG CCCAGGTAGC 120 

GGCCCAGGTG CATGGAGTCG ATGATGATGC GACTCTCCAG CTCGCCGACC GGGAGCTTGG 180 

CATCGGGCCT GATCAGCCAG GACGCGTAGG ACAAGTCGAT CGAATGCATA GTGGCCTCCA 240 

GAGTGGCCGT GCCACTTCCG GCGTGCTCCA CGGCAAATGC CTTGATTTCT AGCTCCGCGT 300 

AGTGTTCCCG CATCGCCTGC GGGATGAATG GGAACCGCAG GATGGCGACA AACGGGTCTG 360 

ACCTCAGGTT TGCCGCTTTG CGCACAGTGG TCGACAGCCG GTACTCGGCA TAAATGCTGG 420 
CCCCGA 426

(2) INFORMATION FOR SEQ ID NO: 247:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247: 

AGACCGGCGA GGGTGTGGTC GCTGCCCGCG GCATTGTCGA TAATCTGCGC TGGGTCGACG 60 

CGCCGATCAA CTAGTGAGGC GCAACGCTAG GCTTTGGGAT ACCCACAGCT AAAAAGTTTA 120 

TCAAAGAAAC GAAGAAGGTT GCCATGAGCA CTGTTGCCGC CTACGCCGCC ATGTCGGCGA 180

CCGAACCCCT GACCAAGACC ACGATCACCC GTCGCGACCC GGGCCCGCAC GACATGGCGA 240

TCGACATCAA ATTCGCCGGA ATCTGTCGCT CGGACATCCA TACCGTCCAA ACCGAATGGG 300 

GGCAACCGAA TTTACCTGTG GTCCCTG 327
(2) INFORMATION FOR SEQ ID NO: 248:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 123 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:

Asp His Gly Gly Pro Ala Thr Asn Pro Gly Ser Gly Ser Arg Gly Gly

10 15
Ala Gly Gly Ser Gly Gly Asn Gly Gly Ala Gly Gly Asn Ala Thr Gly

25 30
Ser Gly Gly Lys Gly Gly Ala Gly Gly Asn Gly Gly Asp Gly Ser Phe

Gly Ala Thr Ser Gly Pro Ala Ser Ile Gly Val Thr Gly Ala Pro Gly

Gly Asn Gly Gly Lys Gly Gly Ala Gly Gly Ser Asn Pro Asn Gly Ser

Gly Gly Asp Gly Gly Lys Gly Gly Asn Gly Gly Ala Gly Gly Asn Gly

85 90 95

Gly Ser Ile Gly Ala Asn Ser Gly Ile Val Gly Gly Ser Gly Gly Ala

100 110
Gly Gly Ala Gly Gly Ala Gly Gly Asn Gly Ser 

120 

(2) INFORMATION FOR SEQ ID NO: 249:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 104 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:



10 15 
Asp Phe Ser Pro Ala Asp Ph< 
30 

Asp Ala He Leu Leu Arg Arc 
45 

Pro Asp Trp Asp Leu Val Gli 
60 

Asp Thr Val Arg He Asp Va; 

75 80 
Ala Ala Ala Ser Lys Leu Tiu 
90 95 



Met 

1 


Ala 


Ala 


Ala Gly Thr Thr 

c 


Ala 


Asn 


Asn 


Asp 


Pro 


Leu His 


Leu 


Ala 


Ser 


He 


Val 






20 








25 


Thr 


Glu 


Gly His 


Arg 


Leu 


Arg 


Ala 


Thr 




35 








40 




Asp Arg 


Leu Pro 


Phe 


Ala 


Glu 


Pro 




50 








55 






Ser 


Gin 


Leu 


Arg Thr 


Thr 


Val 


Thr 


Ala 


65 








70 






He 


Ala 


Asp 


Asp Met 


Arg 


Pro 


Glu 


Leu 








85 










Glu 


Ser 


Leu 


Arg Leu 


Tyr 


Asp 


Ser 





100
(2) INFORMATION FOR SEQ ID NO:250:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:250:

Ala Tyr Ala Leu Arg Ser Arg Cys Arg Val Asn Ala Pro Glu Ala Ile

1 5 10 15

Ala Ser Leu Pro Arg Asn Gly Ser Ile Thr Ile Ala Val Cys Arg Arg

20 25 30

Ala Pro Thr Pro Pro Ser Asn Val Asn
35 40

(2) INFORMATION FOR SEQ ID NO: 251:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:

Val Pro Leu Asn Thr Ser Pro Arg Leu Pro Asp Leu Pro Asp Ser Val

5 10 15

Val Pro Pro Val Ala Ser Leu Leu Ser 

20 25 
(2) INFORMATION FOR SEQ ID NO: 252:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 61 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:252:

Met Ser Thr Val Ala Ala Tyr Ala Ala Met Ser Ala Thr Glu Pro Leu

1 5 10 15

Thr Lys Thr Thr Ile Thr Arg Arg Asp Pro Gly Pro His Asp Met Ala

20 25 30

Ile Asp Ile Lys Phe Ala Gly Ile Cys Arg Ser Asp Ile His Thr Val
35 40 45
Gln Thr Glu Trp Gly Gln Pro Asn Leu Pro Val Val Pro
50 55 60 

(2) INFORMATION FOR SEQ ID NO: 253:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 213 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:253: 

GCTTGGAGCC CTGGAGCGAC GGTGTGGGTC TGGGGGTCGA TTCGTTCTCG GCGAAAGTCA 60 

ACTAAAGACC ACGTTGACAC CCAACCGGCG GCCCGGCATG GGCCGTCGCG GCGTAGAAGC 120 

TTTGACCGCG GCGCGAAACG TTC3CTGCTG CGGCCCATGC AGATCGCACA CGCTTGCTTG 180 

AACATCGGGT GGAGCCGGTG GTAACGCCAG GCT 213 

(2) INFORMATION FOR SEQ ID NO: 254: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 367 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: cDNA 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254: 

CCGAGCTGCT GTTCGGCGCC GGCGGTGCGG GCGGCGCGGG TGGGGCGGGC ACCGACGGCG 60

GGCCCGGTGC TACCGGCGGG ACCGGCGGAC ACGGCGGAGT CGGCGGCGAC GGCGGATGGC 120

TGGCACCCGG CGGGGCCGGC GGGGCCGGCG GGCAAGGCGG GGCAGGTGGT GCCCGCAGCG 180 

ATGGTGGCGC GTTGGGTGGT ACCGGCGGGA CGGGCGGTAC CGGCGGCGCC GGTGGCGCCG 240 

GCGGTCGCGG CACACTGCTG CTGGGCGCTG GCGGACAGGG CGGCCTCGGC GGCGCCGGCG 300 

GACAAGGCGG CACCGGCGGG GGCCGGCGGA GATGGCGTTC TGGGGGGTGT CAGTGGCACT 36 0 

GGTGGTA 367 

(2) INFORMATION FOR SEQ ID NO: 255:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 420 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:

AAGGCGTGAT TGGCAAGGCG ACCGCGCAGC GGCCCGTAGC CGCGGGACGG CCCAGGCCCC 60

GACCGCAGCG GCCGGTGTCT GACCGGGTCA GCGACCAGCG GCGCTGACCG TGCCGCTCGT 120

CTACTTCGAC GCCAGCGCCT TCGTCAAACT TCTCACCACC GAGACAGGGA GCTCGCTGGC 180

GTCCGCTCTA TGGGACGGCT GCGACGCCGC ATTGTCCAAC CGCCTGGCCT ACCCCGAAGT 240

CCGCGCCGCA CTCGCTGCAA CGGGCCGCAA TCACGACCTA ACCGAATCCG AGCTCGCCGA 300

CGCCGAGCGT GACTGGGAGG ACTTCTGGGC CGCACCCGCC CAGTCGAACT CACCGCGACG 360

GTTGAACAGC ACGCCGGGCA CCTCGCCCGA ACACATGCCT TACGCGGAGC CGACACCGTT 420

(2) INFORMATION FOR SEQ ID NO: 256: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 299 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256:

CTCTTGTCGG TGGCATCGGC GGTACCGGCG GAACCGGCGG CAACGCCGGT ATGCTCGCCG 60 

GCGCCGCCGG GGCCGGCGGT GCCGGCGGGT TCAGCTTCAG CACTGCCGGT GGGGCTGGCG 120 

GCGCCGGCGG GGCCGGTGGG CTGTTCACCA CCGGCGGTGT CGGCGGCGCC GGTGGGCAGG 180 

GTCACACGGG CGGGGCGGGC GGCGCCGGCG GGGCCGGCGG GTTGTTTGGT GCCGGCGGCA 240 
TGGGCGGGGC GGGCGGATTC GGGGATCACG GAACGCTCGG CACCGGCGGG GCCGGCGGG 299

(2) INFORMATION FOR SEQ ID NO: 257:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257:

Leu Glu Pro Trp Ser Asp Gly Val Gly Leu Gly Val Asp Ser Phe Ser

1 5 10 15

Ala Lys Val Asn
20

(2) INFORMATION FOR SEQ ID NO: 258:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 121 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258:

Glu Leu Leu Phe Gly Ala Gly Gly Ala Gly Gly Ala Gly Gly Ala Gly

1 5 10 15

Thr Asp Gly Gly Pro Gly Ala Thr Gly Gly Thr Gly Gly His Gly Gly

20 25 30

Val Gly Gly Asp Gly Gly Trp Leu Ala Pro Gly Gly Ala Gly Gly Ala

35 40 45

Gly Gly Gln Gly Gly Ala Gly Gly Ala Arg Ser Asp Gly Gly Ala Leu

50 55 60

Gly Gly Thr Gly Gly Thr Gly Gly Thr Gly Gly Ala Gly Gly Ala Gly

65 70 75 80

Gly Arg Gly Thr Leu Leu Leu Gly Ala Gly Gly Gln Gly Gly Leu Gly

85 90 95

Gly Ala Gly Gly Gln Gly Gly Thr Gly Gly Gly Arg Arg Arg Trp Arg

100 105 110

Ser Gly Gly Cys Gln Trp His Trp Trp

115 120

(2) INFORMATION FOR SEQ ID NO:259: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 34 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259:

Gly Val Ile Gly Lys Ala Thr Ala Gln Arg Pro Val Ala Ala Gly Arg

1 5 10 15

Pro Arg Pro Arg Pro Gln Arg Pro Val Ser Asp Arg Val Ser Asp Gln
20 25 30

Arg Arg



(2) INFORMATION FOR SEQ ID NO: 260: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 99 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260: 



Leu Val Gly Gly Ile Gly Gly Thr Gly Gly Thr Gly Gly Asn Ala Gly

1 5 10 15

Met Leu Ala Gly Ala Ala Gly Ala Gly Gly Ala Gly Gly Phe Ser Phe

20 25 30

Ser Thr Ala Gly Gly Ala Gly Gly Ala Gly Gly Ala Gly Gly Leu Phe

35 40 45

Thr Thr Gly Gly Val Gly Gly Ala Gly Gly Gln Gly His Thr Gly Gly

50 55 60

Ala Gly Gly Ala Gly Gly Ala Gly Gly Leu Phe Gly Ala Gly Gly Met

65 70 75 80

Gly Gly Ala Gly Gly Phe Gly Asp His Gly Thr Leu Gly Thr Gly Gly

85 90 95

Ala Gly Gly



(2) INFORMATION FOR SEQ ID NO: 261: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 282 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261:

TCCTGTTCGG CGCCGGCGGG GTGGGCGGTG TTGGCGGTGA CGGTGTGGCA TTCCTGGGCA 60

CCGCCCCCGG CGGGCCCGGT GGTGCCGGCG GGGCCGGTGG GCTGTTCAGC SSSE 120

CCGGCGGCGC CGGCGGAATC GGATTGGTCG GGAACAGCGG TGCCGGGGGG TCCGGCGGGT 180

CCGCCCTGCT CTGGGGCGAC GGCGGTGCCG GCGGCGCGGG TGGGGTCGGG TCCACTACCG 240

GCGGTGCCGG CGGGGCGGGC GGCAACGCCA GCCTGCTGGT AA 282



(2) INFORMATION FOR SEQ ID NO: 262: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 415 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:

CGGCACGAGC CGTGCTACTG GTCAACTGAT GCCCTGATTG TGACCTTCCC GGCGCCGGAT 60

CAGTGCTTCT CAGGACCGAC GTAATATTCG AAAACCAATC CGGCCGCCGA GGCGAGGATG 120

AATGCCACAC CGGCGGCGAT CAGCCACGGG AGCCACAACG CGATGCCGAC CGCTGCCACC 180

GAGCCGGACA ACGCGACCAT GATCGGCCAC CAGCTATGCG GACTGAAGAA TCCAAGTTCT 240

CCTGCGCCGT CGCTGATTTC AGCGCCTTCG TAGTCCTCGG GCCGGGAATC TAACCGGCGG 300

GCCACAAACC GGAAGAAGGT GGCGACGATC AACGCCATGC CGCCGGTGAG CGCCAACGCA 360

ATGGTGCCAG CCCACTCGAC ACCACCGGTG GCGAACATCG AGGTCAACAC GCCGT 415



(2) INFORMATION FOR SEQ ID NO:263: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 373 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:263: 




£S££ SSS°" A CGGCCTCTTC ACTGCGCGCT "S 

ACCTGACGCT CCT ATCGGTTTGA ACGTCATCGC AATTCCCGCA ATGGGTGAGT 360 

373 

(2) INFORMATION FOR SEQ ID NO: 264: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 423 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264:

SS £2555 «»™»e 



££££ G,u:cic<!GC0 TSM "^ «SS£ ZSS 

SSSS "SSS = ™ 
SESS 3525= S ™ ™- S£S JS££ 
= ~ SEES SSS? S= 2SSS 



(2) INFORMATION FOR SEQ ID NO: 265:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 404 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:



SSScC ATGCATCCAG CTCCCGGTAC GTCAGCTGAC CATCCGCCCA 

SSccSc GCTGTGCCGC AGCGATTTCG GCGAACCGGG TATGCACCGC 

SS^Sg SSSSJ GCGGCAGGCC <^TGCGGTC GGATCGTGCT cgccgtccag 

S^CGC !S5 CCGATC CC ^CGGCTG ACCAAGCGCT GTAACACAGC 

SSSgc £££££ ™ GGGCGC AGCGCACCGT CGAGCACCTC 

SSctSct agcg?^ ^ CTGCG ctgcgcggcg acggtcaccg GAAAGTGCGA 

wiAAu I LTCT AGCGCCACCG GACGGAACGT CACCCCGTTT GCGA 

(2) INFORMATION FOR SEQ ID NO: 266:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 421 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266:

GTCCTGGTCG CAGGCTGTTC TTCGAACCCG CTGGCTAACT TCGCACCCGG GTATCCGCCC 60

ACCATCGAAC CCGCCCAACC GGCGGTGTCA CCGCCTACTT CGCAAGACCC GGCCGGTGCA 120 

GTGCGACCAC TGAGCGGCCA CCCCCGGGCG GCACTATTCG ACAACGGCAC CCGCCAATTG 180 

GTGGCTCTGC GCCCGGGCGC CGATTCGGCG GCACCCGCCA GCATCATGGT CTTCGATGAC 240 

ATGCACGTTG CACCGCGCGT CATTTTTCTG CCGGGCCCGG CAGCCGCGTT GACCAGCGAC 300 

GACCACGGCA CGGCCTTCCT TGCCGCCCGC GGCGGCTACT TCGTGGCCGA CCTGTCCTCC 360 

GGTCACACCG CACGAGTGAA TGTCGCTGAC GCAGCGCACA CCGATTTCAC CGCGATCGCC 420 

° 421 
(2) INFORMATION FOR SEQ ID NO: 267: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 426 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267:

ATGCATATCA CGCTCAACGC CATCCTGCGT GCGATCTTCG GGGCCGGCGG CAGTGAACTA 60

GACGAGCTGC GCCGCCTCAT TCCGCCGTGG GTCACGCTGG GCTCGCGCCT GGCGGCGCTA 120

CCGAAACCCA AACGCGACTA TGGCCGCCTT AGCCCGTGGG GCCGGCTGGC CGAGTGGCGG 180

CGCCAGTACG ACACTGTCAT CGACGAGCTC ATCGAAGCCG AGCGGGCCGA CCCGAACTTC 240

GCCGATCGGA CCGACGTTTT GGCGTTGATG CTGCGCAGCA CTTACGACGA CGGTTCCATC 300

ATGTCGCGCA AGGACATTGG CGACGAACTG CTCACGCTGC TTGCCGCCGG GCACGAAACC 360

ACGGCGGC3A CATGGGCTGG GCGTTCGAAC GGCTCAACCG GCACCCCGAC GTGCTCGCGG 420

CTCTGG 426



(2) INFORMATION FOR SEQ ID NO: 268:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 522 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268:

GTCCTGGTCG CAGGCTGTTC TTCGAACCCG CTGGCTAACT TCGCACCCGG GTATCCGCCC 60

ACCATCGAAC CCGCCCAACC GGCGGTGTCA CCGCCTACTT CGCAAGACCC GGCCGGTGCA 120

GTGCGACCAC TGAGCGGCCA CCCCCGGGCG GCACTATTCG ACAACGGCAC CCGCCAATTG 180

GTGGCTCTGC GCCCGGGCGC CGATTCGGCG GCACCCGCCA GCATCATGGT CTTCGATGAC 240

GTGCACGTTG CACCGCGCGT CATTTTTCTG CCGGGCCCGG CAGCCGCGTT GACCAGCGAC 300

GACCACGGCA CGGCCTTCCT TGCCGCCCGC GGCGGCTACT TCGTGGCCGA CCTGTCCTCC 360

GGTCACACCG CACGAGTGAA TGTCGCTGAC GCAGCGCACA CCGATTTCAC CGCGATCGCC 420
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SSSS S~ ™- £™ c,™ „„ 

(2) INFORMATION FOR SEQ ID NO:269: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 739 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269:

SSSS SEE ™S sss? 2— « 

120 

CCATGCAGCC GGoSSS ATCA^TcS SSSSJ " CCG ««» <*CTCGTCGT 240 
CGGACTGCCA GGGCGCGCTG tJSSSS ^1^° GCCGG ™CG GTGTCCCTGC 



AUCCGTGGCC CTTAGTGGCr rrnnnr**r*~~ -^llu, utaUAAAACCA 

TCGGCATCTG SSStc CGGcSaS ^TCGTGCT CGTCCTCGTG TTGGGCGCCA 
AGCGCCTTAG £££££ SSS^ GCCTCAGCCG ^GCGGAGG 180 

CCATGCAGCC GGGCAAACCG AwSiSSJ CBGAWSTCaA CGCCGTGATG GGCTCGTCGT 
CGGACTGCCA SSctSc JSSSS GTGTCCCTGC 
CCGCCATCAA CGGCTTGATT TcSS^S ^TCCGGT GTATGCCGGC ACCGGCTACA 
AAGCCGTCGT CGCcSS IcCGCcSS SSS^ CTAC «* ai ^GGTGAACC 420 
ACAAATGGAA GAACTGCgS cScSScS JSSSSS GTTCGTGCAG ^CGGCCG 480 
GGTGGACGTT TGCCGACGTC AaSJSS GAATAAGGCC AAGACCTACC 540 

AAGGCGCTGA GGGCTGGGAA iSJSSSS CGCCGACGAT CACGGTGATA GACACCCAAG 600 
ACGTCAACGC ScSt^ SgaISS GTGGTTGTCG 66 S 

GTTGACAAAG TCAACAAGG ATCAAGCAGG CCAGATCGCC GCCAAGATCT 



(2) INFORMATION FOR SEQ ID NO: 270: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 69 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:
StIt C G G C C <M8Cra TCGCCCGCGC GGA ^CGTT AACCCGGCAC TGAACGCGTT 

(2) INFORMATION FOR SEQ ID NO: 271:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 523 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271:



720 
739 



60 
69 
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£££££ SSSSJ CCAACGGATC GGGTCAACTA GCACTGCCGG TGGAGGCGCC 


(2) INFORMATION FOR SEQ ID NO: 272: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 224 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272:

SEE SEE EE SJ EE see 

= EE = EE ST" 1C ™ 

(2) INFORMATION FOR SEQ ID NO: 273: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 521 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:273: 



SO 
120 
180 
240 
300 
360 
420 
480 
523 



60 
120 
180 
224 



60 
120 
180 
240 
300 
360 



Z G^cS£ SSSS GCSGC3T ^ ATAGCTOCGC CGCCAGGCCA 

Z-T TTCG^TAG CGGGCCTTGG TCTCGGCCTT GTCCAAACCC TBOrrrrm 

EE EE SET* GCCGCicraj ^SSS 

rZrZZT C CAGATGC,jCG GTGGTGATCG CGCGCCGCAG CAACGAGGTG TAGAGOvrrT 

ee E252 »S ICAGCT caoo,,TOo EE 

ee = see f™ ™—» H! 

(2) INFORMATION FOR SEQ ID NO: 274: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 426 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274:

£cg?S5g ESSES ^ GCCAC ccggc <*™ AGCGGACGCC CCCATTCGTT 
GCGAACCAGC CcSSccS CGAACCTCCG CCCAGGG^cI 

CGCGAAGGGT SScSrl SSS?* TCAGAACGGT AGTGCACGAC AGTCTCGCCG 
CCGAGCTGAG £££££ SSSS JS^" CCGACGAGGC GTGGATCGCC 
TCGCTGGTCT TCTTCCcSS SgS^ GTTC ^CAC CCAGTCCACC 

GTGACCCCGA ACGCcSS Sr^Sf GTGAACTTGA CCGCGTCGAC ATCGGCGCGG 
TCGCCA ACGCCGGCAG CGTCGTCGCC GTCGTCGCCC GCGGCAGGGG CGGCAACTGC 



(2) INFORMATION FOR SEQ ID NO: 275: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 219 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 275:

SSSS S= S 00 ^ 03 coctMaKC »«•»« 

GGCGGTGGTA CCoSSS "ccS^S ^KCGGCGC CGGCGGK^c GGCGGGGCAA 
««„»: „£££ SSSSSS SSSf GCATGGGCGC COC^c 

(2) INFORMATION FOR SEQ ID NO: 276:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 571 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276:



ctsScc ssss sssr tgcgctctgc atcgtcgccg gcgc — 

ATCCTTTTCG SSgGCCC gSS™ M*™** ACCACAGTCG GGTTCTCGGG 
TGGGTGCCGG £££££ SSSS SS™ 3 ^ TTCACCM « TGGTATCGGC 
CTTGCGCGCC AATGCGACCA aSggTcgK CGAGTAAACA CCTGGCGCGG 

GTTGCGAGTG ACCTTGTGCG TKaUXGGT ^TCAACGAGA TCTGCTCACC 

GGCGGGTTCG To£S££ SSr^ <aTTT ™» ^GGCGATGG ACAGCATCTC 
CTTGAACGGC t££££5 ScSSr ^ CCGCACC <^GCAACCA GCTCCTTGGG 
GGTGTCGGTC TTTGCGGTGA aclrrlr^ AGACCCAGCA CCACATCCAC 



60 
120 
180 
240 
300 
360 
420 
426 



60 
12 0 
180 
219 

(2) INFORMATION FOR SEQ ID NO: 277:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 93 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:

Leu Phe Gly Ala Gly Gly Val Gly Gly Val Gly Gly Asp Gly Val Ala

1 5 10 15

Phe Leu Gly Thr Ala Pro Gly Gly Pro Gly Gly Ala Gly Gly Ala Gly

20 25 30

Gly Leu Phe Ser Val Gly Gly Ala Gly Gly Ala Gly Gly Ile Gly Leu

35 40 45

Val Gly Asn Ser Gly Ala Gly Gly Ser Gly Gly Ser Ala Leu Leu Trp

50 55 60

Gly Asp Gly Gly Ala Gly Gly Ala Gly Gly Val Gly Ser Thr Thr Gly

65 70 75 80

Gly Ala Gly Gly Ala Gly Gly Asn Ala Ser Leu Leu Val
85 90

(2) INFORMATION FOR SEQ ID NO: 278: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278:

Met Pro Pro Val Ser Ala Asn Ala Met Val Pro Ala His Ser Thr Pro

1 5 10 15

Pro Val Ala Asn Ile Glu Val Asn Thr Pro

20 25

(2) INFORMATION FOR SEQ ID NO: 279:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279:

Lys Pro Asp Arg Pro Ala Ala Thr Val Gly Ser Cys Thr Thr Val Arg

1 5 10 15

Ala Pro Cys Ser Gln Pro Val Thr Thr Ala

20 25



(2) INFORMATION FOR SEQ ID NO: 280: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280: 

Trp Pro Ala Gly Arg Pro Met His Pro Ala Pro Gly Thr Ser Ala Asp 

1 5 10 15

His Pro Pro Asn 

20 



(2) INFORMATION FOR SEQ ID NO: 281:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 140 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 281:

Val Leu Val Ala Gly Cys Ser Ser Asn Pro Leu Ala Asn Phe Ala Pro

1 5 10 15

Gly Tyr Pro Pro Thr Ile Glu Pro Ala Gln Pro Ala Val Ser Pro Pro

20 25 30

Thr Ser Gln Asp Pro Ala Gly Ala Val Arg Pro Leu Ser Gly His Pro

35 40 45

Arg Ala Ala Leu Phe Asp Asn Gly Thr Arg Gln Leu Val Ala Leu Arg

50 55 60

Pro Gly Ala Asp Ser Ala Ala Pro Ala Ser Ile Met Val Phe Asp Asp

65 70 75 80

Met His Val Ala Pro Arg Val Ile Phe Leu Pro Gly Pro Ala Ala Ala

85 90 95

Leu Thr Ser Asp Asp His Gly Thr Ala Phe Leu Ala Ala Arg Gly Gly

100 105 110

Tyr Phe Val Ala Asp Leu Ser Ser Gly His Thr Ala Arg Val Asn Val

115 120 125

Ala Asp Ala Ala His Thr Asp Phe Thr Ala Ile Ala

130 135 140



(2) INFORMATION FOR SEQ ID NO: 282: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 142 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 282:

Met His Ile Thr Leu Asn Ala Ile Leu Arg Ala Ile Phe Gly Ala Gly

1 5 10 15

Gly Ser Glu Leu Asp Glu Leu Arg Arg Leu Ile Pro Pro Trp Val Thr

20 25 30

Leu Gly Ser Arg Leu Ala Ala Leu Pro Lys Pro Lys Arg Asp Tyr Gly

35 40 45

Arg Leu Ser Pro Trp Gly Arg Leu Ala Glu Trp Arg Arg Gln Tyr Asp

50 55 60

Thr Val Ile Asp Glu Leu Ile Glu Ala Glu Arg Ala Asp Pro Asn Phe

65 70 75 80

Ala Asp Arg Thr Asp Val Leu Ala Leu Met Leu Arg Ser Thr Tyr Asp

85 90 95

Asp Gly Ser Ile Met Ser Arg Lys Asp Ile Gly Asp Glu Leu Leu Thr

100 105 110

Leu Leu Ala Ala Gly His Glu Thr Thr Ala Ala Thr Trp Ala Gly Arg

115 120 125

Ser Asn Gly Ser Thr Gly Thr Pro Thr Cys Ser Arg Leu Trp
130 135 140

(2) INFORMATION FOR SEQ ID NO: 283:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 163 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 283:

Val Leu Val Ala Gly Cys Ser Ser Asn Pro Leu Ala Asn Phe Ala Pro

1 5 10 15

Gly Tyr Pro Pro Thr Ile Glu Pro Ala Gln Pro Ala Val Ser Pro Pro

20 25 30

Thr Ser Gln Asp Pro Ala Gly Ala Val Arg Pro Leu Ser Gly His Pro

35 40 45

Arg Ala Ala Leu Phe Asp Asn Gly Thr Arg Gln Leu Val Ala Leu Arg

50 55 60

Pro Gly Ala Asp Ser Ala Ala Pro Ala Ser Ile Met Val Phe Asp Asp

65 70 75 80

Val His Val Ala Pro Arg Val Ile Phe Leu Pro Gly Pro Ala Ala Ala

85 90 95

Leu Thr Ser Asp Asp His Gly Thr Ala Phe Leu Ala Ala Arg Gly Gly

100 105 110

Tyr Phe Val Ala Asp Leu Ser Ser Gly His Thr Ala Arg Val Asn Val

115 120 125

Ala Asp Ala Ala His Thr Asp Phe Thr Ala Ile Ala Arg Arg Ser Asp

130 135 140

Gly Lys Leu Val Leu Gly Ser Ala Asp Gly Ala Val Tyr Thr Leu Ala

145 150 155 160

Lys Asn Pro



(2) INFORMATION FOR SEQ ID NO: 284:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 240 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 284:

Trp Gly Ala Pro Pro Ser Gly Gly Pro Ser Pro Trp Ala Gln Thr Pro

1 5 10 15

Arg Lys Thr Asn Pro Trp Pro Leu Val Ala Gly Ala Ala Ala Val Val

20 25 30

Leu Val Leu Val Leu Gly Ala Ile Gly Ile Trp Ile Ala Ile Arg Pro

35 40 45

Lys Pro Val Gln Pro Pro Gln Pro Val Ala Glu Glu Arg Leu Ser Ala

50 55 60

Leu Leu Leu Asn Ser Ser Glu Val Asn Ala Val Met Gly Ser Ser Ser

65 70 75 80

Met Gln Pro Gly Lys Pro Ile Thr Ser Met Asp Ser Ser Pro Val Thr

85 90 95

Val Ser Leu Pro Asp Cys Gln Gly Ala Leu Tyr Thr Ser Gln Asp Pro

100 105 110

Val Tyr Ala Gly Thr Gly Tyr Thr Ala Ile Asn Gly Leu Ile Ser Ser

115 120 125

Glu Pro Gly Asp Asn Tyr Glu His Trp Val Asn Gln Ala Val Val Ala

130 135 140

Phe Pro Thr Ala Asp Lys Ala Arg Ala Phe Val Gln Thr Ser Ala Asp

145 150 155 160

Lys Trp Lys Asn Cys Ala Gly Lys Thr Val Thr Val Thr Asn Lys Ala

165 170 175

Lys Thr Tyr Arg Trp Thr Phe Ala Asp Val Lys Gly Ser Pro Pro Thr

180 185 190

Ile Thr Val Ile Asp Thr Gln Glu Gly Ala Glu Gly Trp Glu Cys Gln

195 200 205

Arg Ala Met Ser Val Ala Asn Asn Val Val Val Asp Val Asn Ala Cys

210 215 220

Gly Tyr Gln Ile Thr Asn Gln Ala Gly Gln Ile Ala Ala Lys Ile Cys

225 230 235 240

(2) INFORMATION FOR SEQ ID NO: 285:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 285:

Asp Val Val Glu Ala Ala Ile Ala Arg Ala Glu Ala Val Asn Pro Ala

1 5 10 15

Leu Asn Ala Leu Ala Tyr

20

(2) INFORMATION FOR SEQ ID NO: 286: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 286: 

Leu His Pro Ala Gly Ala Thr Asn Gly Ser Gly Gin Leu Ala Leu Pro 

10 is 
Val Glu Ala Pro Pro Arg Ser Val Pro Ser His Gly Glu Pro Leu Gly 

2 0 25 
Ser Ala Ala Pro Glu Gly Leu Glu Gly Glu Phe Asp Asp Arg He Asp 

Glu Arg Phe Pro Val Phe Ser Ser Ala Ser Leu Ala g!u Ala Leu Pro 



Gly Pro Leu Thr Pro Met Thr Leu Asp Val Gin Leu Ser Gly Leu Arg 
Ala Ala Gly Arg Ala Mec Gly Arg Val Leu IL Leu Gly Gly Val 111 

Ala Asp Glu Trp Glu Arg Arg Ala He Ala Val Phe Gly His Arg Pro 

100 105 110 

Tyr lie Gly Val Ser Ala Asn U e Val Ala Ala Ala Gin Leu Pro Gly 

120 125 
Trp Asp Ala Gin Ala Val Thr Arg Arg Ala Leu Gly Glu Gin Pro Gin 
130 135 



140 



Val Thr Glu Leu Leu Pro Phe Gly Arg Pro Gin Leu Ala Gly Gly Pro 

Leu Gly Ser Val Ala Lys Val Val Val Thr Ma Arg Ser Leu "° 
16 5 170 

(2) INFORMATION FOR SEQ ID NO: 287:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 61 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 287:

Val Gly Val Val Gly Val Gly Ala Thr Ser Pro Ala Gly Ala Gly Ala 

Gly Ala Gly Ser Ala Gly Thr Gly Ala G^y Ala Gly Gly Gly Thr 

Lys Gly Arg llm Ser Ala Ser Ma Leu Ala Ala Pro tlu Ser Thr 

40 45 
Gly Leu Leu Ala Val Pro Ser His Thr Thr Asn Gin Arg 
50 55 60 

(2) INFORMATION FOR SEQ ID NO:288: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 133 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 288:

Met Ala Asn Thr Gly Ser Leu Val Leu Leu Arg His Gly Glu Ser Asp 

10 nc 

Trp Asn Ala Leu Asn Leu Phe Thr Gly Trp Val Asp Val Gly Leu Thr 

Asp Lys Gly Gl n Ala Glu Ala Val Arg Ser Gly Glu Leu Ala Glu 

40 

Hi. Asp Leu Leu Pro Asp Val Leu Tyr Thr Ser Leu Leu Arg Arg Ala 
lie Thr Thr Ala His Leu Ma Leu Asp Ser Ala Asp Arg Leu Trp He 
Pro Val Arg Arg Ser Trp Arg Leu Asn Glu Arg His Tyr Gly Ala Leu 
Gin Gly Leu Asp Lys Ala Glu Thr Lys la Arg Tyr Gly Glu gL Gin 



105 110 



Phe Met Ala Trp Arg Arg Ser Tyr Asp Thr Pro Pro Pro Pro He Glu 

Arg Gly Ser Gin Phe 

130 



120 125 



(2) INFORMATION FOR SEQ ID NO: 289:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 63 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 289:

Pro Gly Ser Phe Ala Arg Thr Lys Pro Pro Gly Arg Thr Ala Asp Ala

1 5 10 15

Pro Ile Arg Cys Arg Asp Ser Arg Gly Thr Ala Gly His Arg Ala Leu

20 25 30

Asp Glu Pro Pro Pro Arg Gly Ser Glu Pro Ala Arg Arg Arg Ser Arg

35 40 45

Gly Val Arg Thr Val Val His Asp Ser Leu Ala Ala Arg Arg Val
50 55 60

(2) INFORMATION FOR SEQ ID NO: 290: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:290: 

Gly His Gly Gly Gln Ser Ala Ile Gly Leu Gly Gly Gly Ala Gly Gly

1 5 10 15

Asp Gly Gly Gin Gly Gly Ala Gly Arg Gly Leu Trp Gly Thr Gly Gly 

20 25 30 

Ala Gly Gly His Gly Gly Ala Arg Arg Trp Tyr Arg Gly Pro Thr Ala 

35 40 45 

Ala Arg Ser Gly Arg His Gly Arg Arg Gly Tto Arg Arg Trp Ala Asd 

50 55 60

Arg Gln Arg Arg Gly Arg Arg Arg
65 70

(2) INFORMATION FOR SEQ ID NO: 291: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 74 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291:

Asp His Arg Arg Arg Ser Leu Ala Ser Leu Arg Ser Ala Ser Ser Pro

1 5 10 15

Ala Arg Ile Thr Glu Val Arg Pro Cys Thr Pro Leu Leu Glu Arg Ser

20 25 30

Ala Pro Gln Ser Gly Ser Arg Asp Pro Phe Arg Pro Trp Pro Ala Asp

35 40 45

Ala Gly His Ala Arg Ser Pro Ala Trp Tyr Arg Leu Gly Ala Gly Asn
50 55 60

Pro Ile Pro Val Arg Ala Ala His His Glu
65 70

(2) INFORMATION FOR SEQ ID NO: 292: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 174 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 292:
CCGCACGTAA CACCGTGAAT TGAAGGGAGC CGCTGGTCAT GGGCCGATTC TATCCGTGGG 

S^SS CCGCTGCCAC "AGTGgJS SSgcS? 

- TTCACGGCA AC.AACGGCG GACACACCAC TTGACATTCG ACAGCACGGC CGCG 

(2) INFORMATION FOR SEQ ID NO: 293:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 404 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 293:

-~G^~C SSSS ^ CCGGTGG CGCTAGAGAG TTTGTCGCAC TTTCCGGTGA 
™ ACCGGTGAGC TCACGCTGCT AGTGGAGGTG CTCGACGGTG 

^X^ GCAC GAT ^CGCCC GAAAGCCTCG GCAGGCGGGT GCTGGCTGTG TTACAGCGCT 
SSS AcS™' CCGCTGCGCG ACGTCGACAT TCTGCTGGAC ScSgSS 
ACCCGGCCTG CCGGATGTGA CGACGTCGGC ACCCGCGGTG CATACCCGGT 

""GAC^TACCG SSSS CGGTGGCGGT CAGTTGGGCG SSSSSS 

* GAC^jTACCG GGAGCTGGAT GCATTGGCCG ACCGGCTGGC CACT 

(2) INFORMATION FOR SEQ ID NO: 294:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 134 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 294:
Ala Asn Gly Val Thr Phe Arg Pro Val Ala Leu Glu Ser Leu Ser His 

Phe Pro Val Thr Val Ma Ala Hxs Arg Ser Thr Gly Glu Leu ^hr Leu 

2 0 "i 



60 
120 
174 



60 
120 
180 
240 
300 
360 
404 



30 
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Leu Val flu Val Leu Asp Gly Leu Gly Thr ^ ^ ^ ^ 
Leu Gly Arg ^ val Leu ^ a £ ^ ^ £ sfir ^ ^ 

Asp Arg Pro Leu ^ ^ £ ^ ^ ^ ^ £ ^ ^ ^ ^ 
Pro Thr Ala Pro Gly Leu Pro ^ m ^ £ ^ ^ ^ ^ £ 
His Thr Arg Phe Ala Glu He Ala Ala Gin Pro ^ Ser £ ^ 

Val Ser Trp Ala Asp Gly Gin Leu Tnr Tyr Arg Glu Leu As" Ala Leu 



Ala Asp Arg Leu Ala Thr 
130 

(2) INFORMATION FOR SEQ ID NO: 295: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 526 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 295:



125 



cSgSgIc SSS~ SSS" 08 TGGGTTGTGC "«»CC»C GACGACAAGG 
TGCTGGGCGC SSotSc^ SSSSS TGTTCGCCGT CGCCGGGGTG AAATACTTGG 
TCTTCCAGCA G^SSS f ACTCGGCGCG CCGCTCCGGC AACGAGTTCC 

AGACCGTCGA £££££ ST""* 
AATATCGGCA GCTGGGCGCC Jc~ISc~ ~r™^ CTTCAACACC ATCGGCAAGG 

ssss ssss sssr -™ sss sssss 
«— cSsss sssss sss ;sr ct ™ 

(2) INFORMATION FOR SEQ ID NO: 296:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 487 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 296:



SSSS ScSt AT Scc^ C GTCAGTGCAT C — CCA ACGATCATCG 
TAGACGAACC ScStSg CCGCGGGCAT GCCGGTGGAA 

CCCAGGAACA TCgSIccC ^CcS^GA G ^CGGC GCCGACAATG 

CGCGGGTAGG CGACGGCTC SgGCGaS ^ GTAGGCG ^ACGTGCAC ATCTCGCTCC 

TAGGTGATGA TCGCcS^ SS~£c AG^S^ GCM ^ C ^GCAAAA 

t-AGCC«ACC AGCGCAAGCT CACGCAGCGG GACACCGGCG 
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SSSS ££S£ G GCAACGCCGG CCACATCGCT 

AATTGGG " GAAGAGCTGA ACACTCGCCG AACGTGCAAC AGCTGCGAAC 



CCCTTGACCC TTTr*r~~~w< -V..UGTTGAC GTATTGTTCC ACCGGCCCGG 

AAGCC^GCC SScgS CGTCGATGGA ^CGCCGACC ACGACGTGCG 

CCGACGAGAT S^GgS^ J!*™"* CTCC ^GAT TTGGCCCCGG 

CGCGGTCACG CcSS^a ^ GGAGCCT ^GGCCGTCT GGGGGAGGCC AGCGCGGGTT 

ccagSSS SSSS SSSS CCGCTTCGGA «tttgcagg CTGCGTTGC, 

GGCGAGCGCA SatSgc «*™GCCC GTTGGCGCCG CCGTTGTAGC 

cggt-gatgc ATATCGGTGC cca ctcgacc caaccgcgac tccataagcg ACACCATTCG 



(2) INFORMATION FOR SEQ ID NO: 299:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 164 amino acids

(B) TYPE: amino acid



420 
480 
487 



(2) INFORMATION FOR SEQ ID NO: 297: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 528 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 297:

™XC ££££ S™ ™~ CGAGCTTGAG ^CCCGGCGC 
CCGACGCCGG CCATGCGATC SSSSrr SS^™* GCACGCT ^ GAAGGTTTCT 
AG CTGGTCGC GTCCTtScg ATCGATGAAr ^ CTGGACACAG 

TGACTTTCAA GACCGATCAT SSSSSS ZS**™* CCGCTCGC <* CGGCCATTAA 
TGCGCGACAG CATCGgSS SSSSS CCGATGATCC TGAGCTAAGC CTGTATGCGC 
GGGAGCGGTT StSScc SSSS TGCTGGCGGG TTTGGAGCCG GACCTGAAGT 
ATCGGCCTGG GCaS^S I? 3 ™ 008 CCTGGGTGTA CGGCAGAACC 

CATTCCAACA ACCG^IS A^cSS 

(2) INFORMATION FOR SEQ ID NO: 298:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 610 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 298:

£S££ SSS! — ccrroc „ 

TGGCCCCGCT G TTTAGT r r ^ ™A GGTTTCCTAC GGTGCCGCCG CCCGGCAGCA 12 0 

cgccgSS g™Sgtc SSSSS -^ GTC accggct ™ C ^CATCGC ilo 

CCCTTGACCC TTTr.^™ ^1^?° C ^GTTGAC GTATTGTTCC ACCGGCCCGG 240 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 299:

Phe Asp Gly Tyr Glu Tyr Leu Phe Trp Val Gly Cys Ala Gly Ala Tyr 
Asp Asp Lys Ala Lys Lys Thr Thr Lys Ala Val Ala Glu Leu Phe Ala 

. . 20 25 



Val Ala Gly Val Lys Tyr Leu Val Leu Gly Ala Gly Glu Thr Cys Asn 
Gly Asp Ser Ala Arg Arg Ser tly Asn Glu Phe Leu Phe Gin Gin Leu 
Ala Gin Gin Ala Val Glu Thr Leu Asp Gly Leu Phe Glu Gly Val Glu 
Thr Val Asp Arg Lys He val Val Thr Cys Pro His Cys Phe Asn Thr 
He Gly Lys Glu Tyr Arg Gin Leu Gly Ala Asn Tyr Thr Val Leu His 
His Thr Gin Leu Leu Asn Arg Leu Val Arg Asp Lys Arg Val Pro 
Val Thr Pro Val Ser Gin Asp He Thr Tyr His Asp III Cys Tyr Leu 
Gly Arg His Asn Lys Val Tyr Glu Ala Pro Arg 111 Leu lie Gly Ala 
Ala Gly Ala Thr 155 160 

(2) INFORMATION FOR SEQ ID NO: 300: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 161 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 300: 
Arg Arg Arg Asp Leu Ala Gly Glu Leu Arg Gin Cys He Gin Thr Pro 
Thr He lie Asp Gin Ala Asp Ala His Asp His Arg Thr Gly hL Gin 
His Arg Gly His Ala Gly Gly Ue Asp Glu Pro Pro Gly Glu Cys Arg 
Lys Leu Gly Gly Lys Lys Asp Gly Ala Asp Asn Ala Gin Glu His Arg 
Gin Pro Thr His Pro Arg Gly Arg Arg Asp Val His lie Ser Leu Pro 

Arg Val Gly Asp Gly Ser Gin Ala Thr Gly Gin His Pro His Arg Thr 

8 5 

Gly Arg Lys U e Gly ^ ^ ^ Qly ^ ^ ^ ^ ^ ^ 
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100 



105 



Leu Thr Gln Arg Asp Thr Gly ^ £1 Ile Gly ^ ^ ™ ^ ^
Thr Gly Asn Ala Gly His u. Ala Gly His Leu Glu Thr Val Leu His
Gln Pro Glu Glu Leu Asn Thr Arg Arg Thr Cys Asn Ser Cys Glu Gln



Leu " 155 160 

(2) INFORMATION FOR SEQ ID NO: 301: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 175 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:301:
Glu Ala Arg Glu Tyr Glu Pro Gly Gln Pro Gly Met Tyr Glu Leu Glu
Phe Pro Ala Pro Gln Leu Ser Ser Ser Asp Gly Arg Gly Pro Val Leu
Val His Ala Leu Glu Gly Phe s« Hp Ala Gly His Ala iL Arg Leu 
Ala Ala Ala His Leu Lys Ala Ma Leu Asp Thr Glu llu Val Ala Ser 
Phe Ala r le ^ Glu ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 

Thr Phe Lys Thr Asp His Phe Thr His Ser Hp Asp Pro Glu Leu Ser 

90 

Leu Tyr Ala Leu Arg Asp Ser Ile Gly Thr Pro Phe Leu Leu Leu Ala
Gly Leu Glu Pro Asp Leu Lys Trp l" ^ Phe Ile Thr ™ ^ ^
Leu Leu Ala Glu Arg Leu Gly III Arg Gln Asn His Arg Pro Gly His
Arg Pro Asp Gly Arg Ser Ala His Thr Thr Asp HI Asp Asp Arg Ser
Phe Gln Gln Pro Gly Ala Ile Ser Asp Phe Gln Pro Phe Asp Leu

1 C 1 ! 



160 



165 "0 175 



(2) INFORMATION FOR SEQ ID NO: 302: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 178 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 302: 

Lys Pro Val Lys Glu Pro Val Pro Ala Leu Pro Pro Val Pro Pro Thr 

1 5 10 15

Pro Ala Leu Pro Pro Leu Pro Pro Leu Pro Pro Val Pro Gly Phe Pro 

20 25 30 

Thr Val Pro Pro Pro Gly Ser Met Ala Pro Leu Phe Arg Pro Phe Ser 

35 40 45 

Pro Ala Pro Pro Ser Pro Ala Leu Pro Pro Ser Pro Pro Leu Pro Pro 

50 55 60 

Leu Val Gly Val Ala Ala Trp Leu Thr Tyr Cys Ser Thr Gly Pro Ala 
65 70 75 80 

Leu Asp Pro Leu Ala Val Ser Ile Ala Ala Ser Met Asp Pro Pro Thr

85 90 95 

Thr Thr Cys Glu Ala Ser Pro Ala Ala Ala Ala Ala Glu Leu Cys Arg 

100 105 110 

Gly Ser Cys Asp Leu Ala Pro Ala Asp Glu Met Met Gly Thr Thr Gly 

115 120 125 

Ala Cys Gly Arg Leu Gly Glu Ala Ser Ala Gly Ser Arg Ser Arg His 

130 135 140 

Thr Arg Arg Cys Ala Ala Ala Ser Glu Ile Cys Arg Leu Arg Cys Thr
145 150 155 160 

Arg Ser Ser Ser Gly Val Pro Arg Asp Trp Val Ser Pro Leu Ala Pro 
165 170 175 

Pro Leu 



(2) INFORMATION FOR SEQ ID NO: 303:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 921 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:303: 

AATTCGGCAC GARCAGCACC AACACCGGCT TCTTCAACTC CGGCGACGTC AATACCGGTA 60

TCGGCAACAC CGGCAGCTTC AACACCGGCA GCTTCAATCC GGGCGATTCC AACACCGGGG 120 

ATTTCAACCC ANGCAGCTAC CACACGGGGA CTCGGAAACA CCGGCGATTT TACACCGGCS 180 

CCTTCATCTC CGGCAGCTAC AGCAACGGGT CTTGTGGAGT GGAAATTATC AGGGCTCATT 240 

GGNTGCACCC GGSCTTRCGA ATCCCTCGKG CCAATTCAAC TCCTCNACAA GCTTGCGGCC 300 

GCACTCSAGC CCGGGTGAAT GATTGAGTTT AACCGCTNAN CAATAACTAG CATAACCCCT 360 

TKGGGCCTCT AAACGGGTCT TGAAGGGTTT TTTGCTGAAA GGANGAACTA TATCCGGATA 420

ACTGGCGTAN TACGAAAAGC CGCACCGATC GCCTTCCCAA CAGTTGCGCA CCKGAATGGC 480

AATGGACCNC CCTKTTACCG GSCATTAACN CGGGGGTGTN GGKGTTACCC CCACGTNACC 540

GCTACCTTGC CANNSSCCTN RSGCCGTCTT TCSTTTCTTC CTTCCTTCTC CCMCTTCGCC 600

GGTTCCCNTC AGCTCTAAAT CGGGGNNCCC TTTMGGGTTC CAATTATTGC TTACNGSCCC 660

CCACCCCAAA AAYTNATTNG GGTTAATGTC C CTTMTT GGG CNTCCCCCTA WTNANNGTTT 720

TCCCCCTTNA CTTTGRSTCC CTTCYTTATW NTGAMNCTNT TTCCACYGGA AAAMNCTCCA 780

CCNTTYSSGS TTTCCTTTGA WTTATMRGGR AATTSCAATY CCGCYTTKGG TTMAANTTAA 840

CYTATTTCNA ATTTTCCCGM TTTTMMNATR TTNSNCKCGM KNCTCCNRKA SSGNTTTCCT 900

CCCCCYTTSS GKTYCCCCRN G 921 
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(2) INFORMATION FOR SEQ ID NO: 3 04: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1082 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 304:

AATTCGGCAC GAGATANGGG CGCACCGGGG TCCGCAGCCG GCGGGACCGT CGCCAGCACC 
ACCGGGGTCA ACAGCACCAC GGTGGCGTCC ANGCAGAGCG CCGCGGTGAT GGCGGC-GAG 
ACGGCRAACA CCTGCCGTAG CAGTCGGTGC GACTCCGCGC TCGCTCGANC CATGGC"3CG 
CCGGCTGCCT CGAACANGCC TTCGTCGTCC ACAGCTTAGC CAGCANCCAA ACCGCACCCA 
GAAACCCACA CGCCCGCZOC CCCGGANACC TGCGCCATCG KCTGCTGGGG CGANA— CC 
CGATCGCTNA CANGATGACC GCTGCCGGAA CGCCGCCGCT GCCTCCGGGC AGCCGCGTGG 
GCSGGGCAAC CGCGAACCCA NGAACACGGC AAGCAGTATC ANCGCAACAG CAATTGTCAA 
GGG CTAAACG CTTCACATCC AGGGATCTCG CGGCGCCACA CCGTCGGMTC TGCAGSGCGA 
CCCCNTCCTN GGGCGGNCAC TOJTCAAAGA TGCNGATCNA CAGKCTAGGT CTTCGGCCGA 
TATGSAAGGN CCCAACGGNT TTAAAGCGGC SAAAAAASTC TCCCANTGGA TAAAATCAGC 
CGGGGANCCC CCCGTGSCMM NGTCYCGGKC ATTNTTCAAC MGGTTTNACG GCGGKTGCNG 
GCCAACTKGC CAAAMTTAAG KTNGGGGNTY CGGGGCGGTA ACCGGCNNTK NGCCCCTTAA 720
AAAACCGGNC YTTTCTKGAT TAMMACCGGN CCCCCAWTGG CGGKTGKTCC CANGNT^AAC 
AMCCYCCCSS MNGGGXTGGS SAACCCTTCC CGNGGGGTTC NTKGTTSCYT AWMCCCCCGG 
AAACCSGKYG GGXTGGCRTN WASSAMNCCC CMNGYYTCTT TAAAGGCCAN KNRAAWGKYT 
CCTTGGGAAW CCTNCAATYC GAAAAYYCTC CTYMMGSSCN CTTXCWRTYN NRNGGGAACS 
AMWTNYCCNC GWTTCAWTC3 GGTCCGASMN AAACXCTTTY TTTTYCGSSC STCCMGGSNC 
3 GGTKNANAN AAASATTTMC YYCTNNANKK YYYCSSGCTT CYKMGRRNRR GMGAACCCGR 
GS 



(2) INFORMATION FOR SEQ ID NO: 305:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 990 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:305:

AATTGGCACG AGTGATCGCG CTGAAGCCGG TAGCGCGGGT GGCTCGGGTG GTTTGCGAAC 
RAAATCCGCT CGANGTGGTC TCGGTAGGCG GTGTCCANAA CGGTGGCGCG GTGCCGGCGG 
ATCTGATCGG CGCGGCCGTA GTGCACGTCG GCGGGCGTGT GCAGTCCGAT GCCGGAATGC 
TTGTGTTCGT GGTTGTACCA GCCGAAGAAC CGGTCGCAGT GCACCCGGGC CGCCTCGATC 
GACTCGAACC GTTTCGGGAA ATCGGGCCGG TACTTGAAGG TCTYGAACTG GGCCTCAGAC 
AACGGGTTGT CTTGCTGGTG TGCGGGCGTG AGTGCGACTT GGTGACACCG AAGTCGGCCA 
NCANCAATGC CACCGGTTTG GAACTCATCC ACAACCCCCG TCCGCGTCMA GGTCACTTGT 
NCGGCGCTAA TTTNYTGGGC GGCAAGGGTT TGCCGAYCAN KCCGCTCGGC CAAAAC~rC3 
ANTCNCSCCA AGGCCNCCAT CCNCCCAAAC AMGTTAC3GG ANAAAANATY CAAAGAYCAC 
CYTCCGGKTN TTATANCTYC C CYTTTG S TY GGGCCCCCCN C YYT G KXN AT ACCCC~MCC* 
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AWTCCCAACN CCCKCCAANA RCYKGGGGCC CCCNCCAACC CGGGXGAAKA WTAATTTAAA 
ACTWMMNACC <*™™** AAMCGTYYNR AGGTTTTSCT 

ANTCGGAAMC CGGNTSTACC AAAAASCCCK CCNWTCCCTC CRASATTGSC NCCSAAWKSA 
TCSGCNWNNC CSGC GGKKKT KKGTTNCCCT WMRCWMWYTS GGCCNASCCN 
CCCCTCCCCM "CCGNKTCC CCAMCCYANC MGGCCCCYTM G^S^ 900 

YKGCCCCCCC AMMNNNGGGG WGACCCTNGG CCCCMKRRGM TCCCNANTGA MCCTCWGNRA 960 

MKCYCCNRAR ANMCCSCNCC NGCNCRCKNN MCCTCWGNRA 960 



(2) INFORMATION FOR SEQ ID NO: 306:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 223 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:306:



T~nrrrr-rrn wvrv^v-uuuuLr ^xutiUACGG CGGGTGGT3 

- x CGGCGGCG GAAGGAACGG CGGGTCCGGC GTCANCRGCG GGGCGGGCGG AAATGCCG 

(2) INFORMATION FOR SEQ ID NO: 308: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1049 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY : linear 



660 
720 
780 
840 



990 



60 
120 
180 
223 



AACGCGGG CCTGTTCG ^ AACGGCGGCG CCGGTGGTGC CGGTGGGGCT 
^ G I^ G CCGGCGGCGC <**™C**MC GCGGGGTGGT TTGGTCATGG GGGCGCTGGC 
GG 5^ G ^! G GTGTANGTGC ^CCGGGGCC AACGGTGCTA CGCCCGGTCA GGATGGcS 
GCTGGTGTTG CCGGGTCGGA CRACRCTCGT GCCGCTCGTG CCG ^"-^G 

(2) INFORMATION FOR SEQ ID NO: 307: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 418 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 307:

^JISSSS GANGCGGCAA CGGTGGCAGC GGCGGCACGT CNGTTGCCAC CGGGGGGGCC 60 
G^AACGGCG GTGCCGGCGG CGCCGGCGGC GGGGCCGGGC TGATCGGCAA CGGCSGCAAC 1^0 

SEES cgatgccccg ggcggcaccg gcgtcngcgg ^ = - 

CTGTTGTTGG GTTTGGACRG CGCCAACGCC CCGGCCAGCA CCAACCCGCT GCACACCGCG 240 

?'SSg? ^ CCGCA gtcaagggg ^ ccatccaggc cgtgaccggg SS££S 

£2S?S GCRGGCACGG CGGGTGGTTG 



360 
418 



(ii) MOLECULE TYPE: Genomic DNA
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(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308: 

AATTCGGCAC GAGGGGCACG ATCGCATACA GCGCTCGCGG CAGACCCGCC CGATACAGCA 
GCTCGGCACA CGCGAGCGCA CAATACGGCG TCTGGCTGTC CGGCTTgSc SS 
TACCGGCCAC CAGCGCGGGC ACCGAGTCCG ACACCGTAAG CGTCAtSS £5££S 
CCCCACCACG CCCTTCGGTT GATAGCACAC £££££ 
CTGTGCCTTA CGGGGCTTCA GCAGGTCCAC ACAGACTCGT GWtSSS 
TNCGCSTTCC GCGATCAGAT CGACAATTTC CTCTTGCGCC GCCCATCGGG CCTt2££ 
RGGRA ^ C ^ TGAAGAACTC GCGGTTCTCG ATNaSS cSgSSg 

SSSS CGATOACGGG * c ™cca gtcggtctgc gcISS 

CTTCCGCGAA TGCCGCTTCG ACTTCCGCGG NCGTGCCAAC GGAATCNTAT CACGGGTTGC 
CGGTTAAAAC TCCTCAATST NCYGGTCGAA ATTCGGCAAC TTCTTATCCC SSSS 
AACSANNCAA ACCTCGGCAA GGTTAGGMTT TCCCCCNCTT YCAAAAaSc 
CMAATTTCGC CKCNATGKTG MCAAGGMTCT CKAANAAKCS GGGTCYTCTN SSSS 

CCAAAMGGKT ttggggmagc gknmnccaan cctwaccctg ktkaSS tSSSS 

SSSS NCCCRGGGGG GNMCARATTC ssss 

(2) INFORMATION FOR SEQ ID NO: 309: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1036 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:309:

-SS^ SS^S AGAATCCCGG ^ggtgaag cctcggtgcc tgccgttacg 

..AAGAKTCA GGG.G^GCGG CCCCCCGGTG GGAATGCTGA SGCCAACCGG GAAAAGGGTG 

TGGAATAACT ^ANGTTACT GGGATGGAAA ACCCGGTATT 180 
SSSS SSSSS ^ TWTO GGCTGAGGGC GACCTGTTGG ArT^I 

2^5! ?™™« TTSAANTTTT GTGCCGSCCA 



60 
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60 
120 



300 



2S££ ~£ SSS 360 

at^gI?cg SSS^S TGGGCGGGAC ^cmasggt ccsangtaak GGTTTCCTTN 

Y^SSSS rSSIS- ACTMTSTCGA TGSGCTCSAY MTSATSGCCC NACNCCWCCG 

ATBAMRGGAA ^A^TCCCT CCCMGGAAAA ACCAACMSGC 
CC.GGTNSYC CNCCCRCCNC AKAACCCRTT KCTGTRSTMC CCSMAAATNA CSCCCSCTTS 
ScccSS jESSS CCCSCKNNTT ATSTYCCCGK GTTCCCCCMC ££55£ 
ACCCCCWTST SNCNCCCCCS VTAAKMNCRG GCTTSTTNCT CCCCCYTRMK 

£££££ cST™ CTC:<AACNAC CCCKCVKGSM TNCCCAATNT WCMWcS 780 

KTtNTMCTKC CCAAYTNCRC CCNCRCTCCC CCKSTSTCAM WTATAAAACC WCWYAWYNNK »4n 
SSHE* HBACTCOT NCCCCNCNCK NTTKTAMWCC CXMCC^SW Sc"£S£ 

mS^S WSCCCKKW NKWMCCCTXC CCCCCCTCCC MCNMBMKTCT YCSOTWBK M0 

ssSs" set* crntctcccc ccwccccccv -™ 

1036 

(2) INFORMATION FOR SEQ ID NO: 310: 
(i) SEQUENCE CHARACTERISTICS: 
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(A) LENGTH: 1036 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:310: 

AATTCGGCAC GAGATCATGA ATAGCGGGCT GGTCAGCACC GAAGTGGTCG GCGATCTCGC 60 

GAGCAAGTCT CGTCTGCTCG CCCAGCAGGA GGTCGGCATC GATGCGGACA CCTGCGATGT 120 

CTTGGATGGT GTTCAGTTGC AGGTAAGGCC GACGCCGCAG CTTTGCTAGC AGGGTGTCTT 180 

GGCTCTTCGC ACGTGAGGTA ACCAATAACT CCGACGCAGA CCAACTCCGG CCCTCGATCC 240 

GGGTACCAGG CTCCGCCGGA GCCAGCCGTT GTGCCCCCTG GGCCGAAGGT CAGCTGCTGT 300 

GCGATCGAAG TAAGAAACCG CGCCATGCCC GTCGCCAAGT ACGACTGACC GAGCAAACGA 360

ACGATCGTC3 TCCTTTCCGT GGGGGTAATC GANCCCAGCA ACCGCACGAG CCACCAATCA 420

TTGGGATTCG GCCACTGACC GACCAACCGC CTGTGCGACA CCCCAGCGGA ATTGGTGGTC 480

TTCCGCGGGG CCGCNAACGG AATCANCGSG ACGCGCTCGC CGAASCANCC GCATANCCNT 540

ACATANCAAC GGNNTCTGCG CCCACATTTC GGGSTTMTGC CCCTCNGCAA CSSNAAYNCC 600

CCCAATTCYG AACNAAAAAA TTGGYCCATY ARNGTYCTCM CCAAAAACCN AWTCCCCKTA 660

TCCCCCGGGG GGGRCCCCYY NMNAAAACGG CCCWWAANCC CCSGGGCSCC CGGGTTRWTN 720

CCCCTTGTCG GCCCNCCSGG TTTGGTCMCM GGSCMMTNWN GGGNTGCSCC CCCNCNAAAA 780

AAAAAYCKNG NCAAATYAAA CCCXYCMAAA ASKTGGGSSC CCCMARCCGG GGKAAKKWWA 840

ANTTAANCCN KAAAAAAAWW NCANNMCCCC NGGGNCCTAA GGKYTTAGGG GTTSTTNANG 900 

ARAAAATMTC CANATMNSSX TTNNAAAAAA ASCCSWAKCC CCCNNNKKNN CCAAWKAARR 960 

SRCCTTCGGG TNWNSGGGGG KKiOGCTNCMS KMNMMTTWGR CCCNCCGCCN NNTWKCCTTN 1020 

TCCNYGGNGC RNCAGN 1036 



(2) INFORMATION FOR SEQ ID NO: 311: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1060 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:311:

AATTCGGCAC GAGTCGATTC GATCGAACAC GCCCGCACCT GGCCAGGCCA CATGGGCGCG 60

GCCATGGCCA ACGCCTACTC GGCCAACCCG AATCCATTCG GCGTCTCACC GCAACCCCCG 120

AAACCGGCGA CCGCGGCATG GATCAACCCG CCCACCCCAG ATCCGAAATA GCGTCCACAT 180

AATGAGACAC TGGCGCAAAG AGCTTGACAG GCGCCGCACC ACGCAAGCTG TTAGACGTGT 240

CGGTCTTGCA AGAAGCGGGT TGGCCACCCA AGATCACGCC GCCCAAGGGC ATCGAGTCAA 300

CGTTGCGGTG GTATCGCGCT AACGTCGGCG CCGCCAAGAA ATGACGGTGC GCATTACCAT 360

GGCCCTGCTG ATCACCTTTG GCCACCTGCG CACCANAACT ATGANCAGCC TTATGCCGAG 420

TCTCGTGGAC ATCGGCAGCC GCTTCAAAAA CTCCTTGTCG ACAATSGTAT TGCTGANCCG 480

CCGAATTCTT NTRCTTGCAA SAACACTNCA TGTTNCSGGT NAACAACCYT GGTTNGAAAA 540

ACANCCAATA TTGAANTCCC ANTCGGGCAM GAACCNGTTM CGGAAGKTGK TGGGAACGAA 600

TGKTGCCCAA AAATCCCGGG NGGTRAAAWW CCCNSNATGG MSAATTTTSC CTNGAACAAM 660

AAAAGGTCCA AGKYCAAAGG NGCCCCCCCC SGNAAATTGG TGAACSCAKA WYANRTTCCC 720

WWWTNCAAAT MTTNGGGTCC KNNTCCCCWT AAANGGGSCN CCCCNCCRGG GMGTYTCCCC 780

NWNMGGGMGN CYYCSCCCCA AAAAAAAMMM MTTTCSGKGG SMGGKXCCCC CCSGGTYWGG 840

GKXYTTAAAC CCGGKGGGTX CAAAAAANAN ACCCCCCAMS NGGGGGGAAA ATTTGNAAWT 900
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AAGGKKKTKC SCMACCCCAA AAANMMNNCN AWNCCCGMGK SARGGGGRNY TTMKAGGGMG 960 
GNYCCCCCCW YCGGGGGGNA NAAYAAAAGK NGSNGRGAAT NTTNTTTTGK RSSSRNKT-T 1020 
TYNTCCTYCN CCNMGNRWWG SRAMNTGKTS NSSGGGSGGC 1060

(2) INFORMATION FOR SEQ ID NO: 312:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1040 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:312:

AATTCGGCAC GAGCTTCACC AAAGAGCTGA CATGCCGGGT GATGCGACAT CGCATCGAGG 60

GCAATACGGG CATGGATGAN CCGAANGGAN TCTGGCGTTC GCTCAACTGG ATTACGGT~~ 120

CCAAGGTGAA ACGCTTTGCG GCGAAAGATG CGACGCTTAA CTTGCGCTTC CACCGTGCaI 180

TGTTNGTATG GATGCTGGAA CCGCGCTGAC NGATAANGAA TTCGCTGGTC GCCGGGCACN 240

ATGGATGGTC CKSTTTTCNC TCCGCSGTTA AATTGCSTGT GCATCATCTG GCAGGCTATG 300

TTCCCGCTAC RCTGCAGCCC ATCATGGATG TGCGGCTAAC GAANAAGTTA TGACATGGCG 360

CAAGCGAMTC GGGCATSCNC GCGGCAMTTT CGCAACCTGC TGTGTNTGAA GCGTMTCAAC 420

CGAATGCGGC GCTYAAAAGC NGGCTTGCGT TGATTMMAAC CNAACCCNTN CNATYCTTTG 480

CCGNGNMNTG CGTTCTCTCC AACTCCGKKG SYTGCCNCCG TGAAACCCMA CTNCCCCCCC 540

GTTGGACTTA MRTNTTCAAA AAMCGGMTNA ACCSGAATNN SAACCTNCCR TCAAANTAMM 600

SAANTCGGGC TTYGGGNRCC CCCCNGAAYW TTCKNCNGGG GMNNTYCTCN GGTTYNGGCG 660

SAAACNTTTG CCRTNCYMNN TTTACAMGGC NCMTNMTTGM GGGSCSNNAS GWCCCGGGKK 720

TNTTTNCAAW TCNCNSKTTT TTKGGGGGGG GGCYGRTRMC NCGGGCCCCC GGCCCKKMAA 780

AAAAAMCMSA RRCCNCYGGG XKCCCCCCCM NNATNGGGCG YKCRAAACAA ACCCCAANRA 840

TNGNGMGGGC 3MACCSGNGN 3YNAAAKGGT TSNSCTMANM MKGMANNNCT SGMSCCMNSN 900

NCTGMGGGKT TTKGNNGARN AANAMKMGGM RCGGNCGCNN GAAAGGGSMS GSCKSCNNGN 960

NGASNGWMGN CRNNGANRCC NCNGYGNMRN NNGNNNGNNN GGGRKNNACN NMKMCAWSMC 1020

NSNMMGNNNS CGYMTNKCGC 1040

(2) INFORMATION FOR SEQ ID NO: 313:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 348 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:313:

AATTCGGCAC GAGACAANGG CGTGAAATGG GATCCGGCCG AGCTGGGGCC CGTCGTCAGC 60 

GACCTGTTGG CCAAGTCGCG GCCGCCGGTT CCGGTCTATG GGGCCTAGTT ATCTGCGCCG 120 

AGCGTGAACT CAGGGCGAGA TTTCGGCCGT TTTCTCGCCC TGGCTTCACG TTCGGCGAAG 180 

TKGGGAACGG TCAGGGTTCG CAAACCACGA TCGGGATCGT GCGGTCGGTC CAGGACTGGT 240

ANTCCTGATA CTTKGGTACA TCGTGACCAA CTGTGGNCAA TATTCGGCGC GCTCCTCGTC 300 

NGTCGCGTCC CGCGCGGTAA GGTCCANCAC TTCCTTTTTC TCGTGCCG 348 

(2) INFORMATION FOR SEQ ID NO: 314:
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 332 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:314:

AATTCGGCAC GAGAGACCGG GTCGTTGACC AACGGACGCT TGGGCGCGGG CCCCTTGCGT 60 

GGCATCAGCC CTTCTCCTTC TTAGCGCCGT AACGGCTGCG TGCCTGTTTG CGGTTCTTGA 120 

CACCCTGCGT ATCCAGCGAA CCGCGGATGA TCTTGTAGCG CACACCAGGC AGGTCCTTCA 180 

CCCGGCCGCC GCGCACCAGC ACCATCGAGT GCTCCTGCAG GTTGTGGCCC TCGCCGGGAA 240 

TGTACGCCGT GACCTCGAAC TGACTCGTCA CTTCACGCGG GCAACCTTCC GAAGCGCCGA 300

GTTCGGCTTC TTCGGAGTGG TGGCTCGTGC CG 332 

(2) INFORMATION FOR SEQ ID NO: 315: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 962 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 315: 

AATTCGGCAC RAGTCGGTCT AGACGGATTC AATGCTCCCG CGAGCACCTC GCCACTGCAC 60 

ACCCTGCAGC AAAATGTGCT CAATGTGGTG AACGAGCCCT TCCAGACGCT CACCGGCCGC 120 

CCGCTGATCG GCAACGGCGC CAACGGGACT CCTGGAACCG GGGCTGACGC GGGGCCGGCG 180

GGTGGCTGTT CGGCAACGGC GGCAACGGCG GGTCCGGGGC GAACGGAACC AACGGCGGGG 240

ACGTGGGGAC GCGCCCGGCG GGATTTCTTC GCACCGGSGC ACCGGCGGGG CCGGCGGCGT 300 

CGCACAACGG CACCGGCGGG GACGCNGCGC CCGTNGGGCG GCTTCTKGAT GGGCTCCGGC 360 

GGTNACGCGG CACGGCGGCG CCCGGCTCAC CGCCNGTTGG GACGCGGGGA CGCGTNACCC 420 

CGATCTTCTT CCGCNCCCCG GAAACCGCGG GGCCGGCCCC ACATTAKACC CGGCGGNACC 480 

GCGGMCCCGG CGGAACGGNG GGYNTTTTCC AACGGCGGGG CCGCGGAACC GNMGGSTGTT 540

CCTTNGGSGA AGGNCCAAKT CCCGKCTANC YYAATCCCCG ANGGKTGAMC CTSATGSNCA 600 

MYTTMAGGAA CYTNCCCANT KTTSGRACCW CRCCNGGAAA ASRAWNKNGT KGGCAAACNA 660 

NNTNCYTTKN NATTKGGNNA AAAANCCCTY CCWCSGRACT NCCCCCCNGM GRGMCNNTNN 720 

NTTTYGNCNN CCCGGSNAAM RNTTKATTTC NGGGGGNTCN GGGTKMNNNA AACCCCAAAM 780 

MNRNNKCSCA ANGGGKSNGC NKNNMMNSGT TTTYCKNMRA MRNWTYKNKN NTCNGARSRN 840 

NAAMCNNSNK NGKKKNNKAA ARNNTTWKTN KNSCNNNCNN GRRNGVRGGC CKMKGSNMNG 900

MCWHNAWRNG NNGSNCNCKC NNKMNAAAAA AASGGVNCKS NSMKNKKKKG NRGGGGGGGG 960 

GG 962 

(2) INFORMATION FOR SEQ ID NO: 316: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 323 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
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(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:316:

AATTCGGCAC RAGAAGACGC CCGAANGTTT GCGCTGGCTC TACAACTTCA TCAARGCGCA 60

GGGGGAACGC AACTTCGGCA AGATCTACGT TCGCTTCCCC GAAGCGGTCT CGATGCGCCA 120

GTACCTCGGC GCACCGCACG GCGAGCTGAC CCAGGATCCG GCCGCGAAAC GGCTTGCGTT 180

GCAGAAGATG TCGTTCGAGG TGGCCTGGAG GATTTTGCAN GCGACGCCNG TGACCGCGAC 240

GGGTTTKGTG TCCGCACTGC TGCTCACCAC CCGCGGCACC GCGTTGACCT CGACCAGCTG 300

CACCACTCGT GCCGCTCGTG CCG 323



(2) INFORMATION FOR SEQ ID NO: 317:



(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1034 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 317: 

AATTCGCAGT GTGTGTGGCG GCGTCCAGAA GAAGATGATC GCGAACATCG CCAGCGCCGG 60

CCAGGCTATG GTGCCGGTGA TGGCCGACCA GCCGATCATC ACCGGCATAC AGCCGGCCGC 120 

CCCACCCCAC ACCACGTTCT GTGACGTGCG TCGCTTGAGC CAAAGCGTGT AGACRAACAC 180 

ATAAAACGCG ACGGTGACCA GGGCCAGCAC CCCCGCCAGC AGGTTCGTGG CGCACCATAG 240 

CCAGAAGAAC GAGATCACCG TCNACGTCAC CCGAGTGCCA ACGCGTTTCG GGTCGGCACC 300 

GCTTCCCGCG CCAAGGGCCG GCGCGCGGTT CGCTTCATCA CCTTGTCGAT ATCGGCGTCG 360 

GCNACCAGTT GAGCGTGTTG GCGCCGGCGG CSGCCATCAT CCCGCCGACN ANCGTGTTGA 420 

GCATGANCAG CGGATGAATG GCGCCGCGGC TCGTGCCGCT CGTGCCGAAT TCAACTCCGT 480 

CNACAACTTG CGGNCGCACT CGAACCCGGG TGAATGAWTG AATTTAAACC GSTSAACANT 540 

AACTACATAA CCCTTGGGGG CTCTTAACCG GTYYTGAANG GGTTTTTTGC TTAAAGGAAG 600 

AACYATTTCC GGATANCTGG CSTTNWTARC GAAAAGGCCC CRCCCATNGC CCTCCACAGT 660 

TTSCCCCTGA ATGGSAATGG MNCNCCYKNR CNGGGNCTTT AACRCSGGCG GGNTTTTGKT 720

MCCCNNCTKA CNTTMMMTGC ARNNCNGGCC SKCCCTTCCK TNTYCCCTCC NTCCCCCNST 780

TNCNGKTCCC CNNAMNYTNW ACGGGGGGCC YTNGGGKCRM TWTKKTTTGG GCCCCMCCCC 840

MAAANASAAN GGGGKRNGTY CSTTTGGCNC CCCAMAARGG NYCCCCCCAM YTNRRKMCSY 900

CNNTNKGGNN CTGTNCXNCG GAARAMAMCC KCCCCGNSTS STTNGTYWAG GNRWKGNSRG 960

CCSCCCCGGY MNNNAAYAWN WMNATNCNNS 3TNANMAKKN NNNNNNNSCN WNGNGNNTCN 1020 

SCNSNGGKBC CSCC 1034 

(2) INFORMATION FOR SEQ ID NO: 318: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 331 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 318: 






AATTCGGCAC GAGCCCACAT CCGGGGCCGC TCGTTGCATG ACTCGTTCGT CATCGTCGAC 60 

RAGGCACAGT CGCTGGAGCG CAATGTGTTG CTGACCGTGC TGTCCCGGTT GGGGACCGGT 120 

TCCCGGGTGG TGTTGACCCA CGACATCGCC CAGCGCGACA ACCTGCGGGT CGGCCGCCAC 180 

GACGGGTCGC CGCGGTGATC GAGAAGCTCA AAGGTCATCC GTTGTTCGCC CACATCACCT 240

TGCTGCGCAG TGAGCGCTCG CCGATCGCCG CGCTGGTCAC GAGATGCTCG ANGAGATCAC 300

CGGGCCGCGC TGAGTGCGCC TCCCGCGAGC A 331

(2) INFORMATION FOR SEQ ID NO: 319: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1026 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 319:

AATTCGGCAC GAGATCGTCA CCCTGGCGAC CAGTGCACCC AGGCCACGCC ACCAGTTACG 60
GCTGATGGGC CAGAAGATGG ACCAGGTGCT GCCCATCCCG CCCACCGCAC TGCAGCTGAG 120
CACCGGGATC GCGGTCCTCA GCTACGGCGA TRAGCTGGTG TTCGGCATCA CCGCTGACTA 180
TGACGCCGCG TCCGAAATGC AGCAGCTGGT CAACGGTATC GAACTGGGTG TGGCGCGTCT 240
GGTGGCGCTC ANCGACAATT CCGTGCTGCT GTTTACAAGG ATCGGCSTAA GCGTTCATCC 300
CGCGCACTCC CCANCGCCGC GCGGCSGGGG CGGCCCTCTG TGCCGACCGC CCGAGCGCGT 360
CACTGACGCC ATCTCCGTCG GCGTTAACCC CGTGAGAAGG TGGGTCGTGC GCAAGTTGGG 420
CCCGGTCACC ATCNATCCGC GCCGCCATGA CGCNGTGCTG TTCCACACCA CNTSNGACNC 480
CCCCCAGGAA CTGGTCCGGC AMTNCAGGAA NTYCGTGTGG GCACCNGCTT CTTCCGKTRT 540
GGCYTAAACT TCCNATSTTN CSGCSGGCCT CTGGCGTTNC GNCCGGGCCG NTCTTNCCAA 600
ATCGGSMMAA ATCCCCANMC AAACCCCCCG GGTCTTGSGG GCSGGGNGGC GGCCNAWNCC 660
AAACCCCCCC NTTAAANTCT TTGKTNCCNN CNCSGGCNCC NCNAANSCAN CCCTTTKGGC 720
NCTTCCCCCC CCCAWTTTAA CCGAKCGSCN AAYCCCAAGY TMMGKCCYCY XNAAAAAAAA 780
AATTTGSCSG CCCCAANTAA ATTCCCNGGC CCYTTGGGGG CGRANCNYNT TTTMCCSNSS 840
TKGNNNAAMC NGGANCCSGG KAAYTMMTKG NAAYCGCCSN AAMBNTTTTC TAANNCCCCN 900
YNCCCSGAAA ATTNNAMAAM CMNNKTGSNG GGGGKTTSNC SGKKGRAGGM AAAAAANRSN 960
SKTTNMCNNN SANMNCNSNN SGGNSNNNNN NNNCNCGYKC CSNAANMCCC CGCGGGGGGG 1020
CCMMCC 1026



(2) INFORMATION FOR SEQ ID NO: 320:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 324 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 320: 

AATTCGGCAC GAGAAGACGC CCGARNGTST GCGCTGGCTC TACAACTTCA TCAARGCGCA 60

NGGGGAACGC AACTTCGGCA AGATCTACGT TC3CTTCCCC GAAGCGGTCT CGATGCGCCA 120

GTACCTCGGC GCACCGCACG GCGAGCTGAC CCAGGATCCG GCCGCGAAAC GGCTTGCGTT 180

GCAGAAGATG TCGTTCGAGG TGGCCTGGAN GATTTTGCAN GCGACGCCNG TNACCGCGAC 240

GGGTTTKGTG TCCGCACTGC TGCTCACCAC CCGCSGCACC GCGTTGACGC TCGACCAGCT 300 
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GCACCACTCG TGCCGCTCGT GCCG 

(2) INFORMATION FOR SEQ ID NO: 321: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1010 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 321: 

AATTCGGCAC GANGCGTGCC GCTNAACACC AGCCCGCGGC TGCCAGATAT CCCGGACTCG 60 

GTAGTGCCGC CGGTGGCGTC GTTGCTCTCC TGACGGGGCG CGGCGACCAT AAGGTCGCTM 120 

ATGCCCAGGT AGCGGCCCAG GTGCATGGAG TCGATGATGA TGCGACTCTC CAGCTCGCCG 180 

ACCGGGAGCT TGGCATCGGG CCTGATCAGC CAGGACGCGT AGGACAAGTC GATCGAATGC 240 

ATAGTGGCCT CCAGAGTGGC CGTGCAMTTC CNGCGTGCTC CACGGCAAAT GCCTTGATTT 300 

CTACTCCGCG TANTGTTCCC GCATCGCCTG CGGGATGAAT GGGAACCGCA SGATGGCGAC 360 

GAACGGGTCT GANCTCAGGT TTGCCGCTTT GCGCACAGTG GTCNACANCC GGTACTCGGC 420 

ATANATCTGG CCCNAAATCG GCGCCGACGG CGCCCACNAT AANAACGGGC ACNACAATCG 480 

CCGCCCCGGT CACCCNAACA ACANCTTGSC ATCGGATTTT GTCCCCANCG CTCAANCCGT 540 

CCCGAACGCC TCNTCCGGCG NACTTTTCTT NNAWTAACTG CCGCTTCCGK CCCTGGNGCA 600 

WTAAATGGGA AACCCTTNCC CCACCTTGAA GGGGTTGTTG NATTTTTACT GSTAACCCCG 660 

AATTNTTCCG GANTCGGTCN KCCGGGSTTT YSTNTTCCCC ACCTTNGNAN GGGCCGGCCA 720

AGSTTTTCTT SYTGAAGGGG GAAACCCAAC TTTNTYTYYN AACCSCMNAA MYMTTTYCSG 780

MNAASCCNKT CCCCTTTAAC CAMGGSGGTN AACCGKTMNG NGGKTAAAAA GGGSKNNKTG 840

NCCCCYMANG GGGGGRAAAA TSTKTCNNCG GGGCCKAAAW ACCMMMMYGN GTGKKKNKSS 900

GCSAAATTTT NMMRAACTKN GGGGCCSSGA NNTTTNAAAG MSCCCCCSNN GSTGKCCCNN 960 

NTTTCCNNAA WMKKGKNWNM SNMNSCSNGG GKYNSGGSNN NNAAGMGGGG 1010 

(2) INFORMATION FOR SEQ ID NO: 322: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1010 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:322: 

AATTCGGCAC GANGCGTGCC GCTNAACACC AGCCCGCGGC TGCCAGATAT CCCGGACTCG 60 

GTAGTGCCGC CGGTGGCGTC GTTGCTCTCC TGACGGGGCG CGGCGACCAT AAGGTCGCTM 120

ATGCCCAGGT AGCGGCCCAG GTGCATGGAG TCGATGATGA TGCGACTCTC CAGCTCGCCG 180

ACCGGGAGCT TGGCATCGGG CCTGATCAGC CAGGACGCGT AGGACAAGTC GATCGAATGC 240

ATAGTGGCCT CCAGAGTGGC CGTGCAMTTC CNGCGTGCTC CACGGCAAAT GCCTTGATTT 300

CTACTCCGCG TANTGTTCCC GCATCGCCTG CGGGATGAAT GGGAACCGCA SGATGGCGAC 360

GAACGGGTCT GANCTCAGGT TTGCCGCTTT GCGCACAGTG GTCNACANCC GGTACTCGGC 420

ATANATCTGG CCCNAAATCG GCGCCGACGG CGCCCACNAT AANAACGGGC ACNACAATCG 480

CCGCCCCGGT CACCCNAACA ACANCTTGSC ATCGGATTTT GTCCCCANCG CTCAANCCGT 540

CCCGAACGCC TCNTCCGGCG NACTTTTCTT NNAWTAACTG CCGCTTCCGK CCCTGGNGCA 600

WTAAATGGGA AACCCTTNCC CCACCTTGAA GGGGTTGTTG NATTTTTACT GSTAACCCCG 660
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AATTNTTCCG GANTCGGTCN KCCGGGSTTT YSTNTTCCCC ACCTTNGNAN GGGCCGGCCA 720 

AGSTTTTCTT SYTGAAGGGG GAAACCCAAC TTTNTYTYYN AACCSCMNAA MYMTTTYCSG 780 

MNAASCCNKT CCCCTTTAAC CAMGGSGGTN AACCGKTMNG NGGKTAAAAA GGGSKNNKTG 840 

NCCCCYMANG GGGGGRAAAA TSTKTCNNCG GGGCCKAAAW ACCMMMMYGN GTGKKKNKSS 900 
GCSAAATTTT NMMRAACTKN GGGGCCSSGA NNTTTNAAAG MSCCCCCSNN GSTGKCCCNN 960
NTTTCCNNAA WMKKGKNWNM SNMNSCSNGG GKYNSGGSNN NNAAGMGGGG 1010



(2) INFORMATION FOR SEQ ID NO: 323:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1092 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:323: 

NGNGGGGWNS NTCAYCAYCA YCACSGGGYW CWATTGCGGC CGCAWCTTGT MAASAGATCT 60 

CGAAYTCGGC AMGAGGGAMT CKCTMGCNCC GCTGTGCAAN CCAATRAGGC CTRATAATTY 120

CCACTCCACA AAAAACCGTT GTGTGTAYYT SCCGRAAATR AAGGCGCCGG TNTCAACWYC 180

GCCGGTKTTY CCRATYCCCG TKTTGTAMCT GCCXGGGTSR AAAYCCCCGG TGTTGGAYCC 240

CCGGATTGAA ACTGCCGGKT TGAAACTGCC GKTTTSGCSA TCCGGKWATT GAMSTCRCGG 300

ATTAAAAAAC CGGKXTTGGN GCTGSNCGTG CCAAATNCGR AYCCRATAYC CCATGGCCTG 360

KYCTYCTCCK YCGGTACCCA AAYCTGGGTA TCCTATACTG GYCCCTAAAK GCAAWYCKGG 420

GCTGYCMMTK TTGCKGGSGT CCNAATTTAS CACCASCGGT TCCTTCCATA CCNAAACNCG 480

CKTGGGCWCC AGMCCGRAAA AAAKAATAAT RAKAAKGGTG CATNYCCAAA ACCNCCGCCN 540

CCCNANTNCN ATCCGNTNCC MSCNCCCCCA GCGGTNAAGK TKSGGAAYTT CTMMAACCCC 600

CAAANCCCCA TAACNTNCGR GAASAAACCC CTYCNCGGGG GYCNWNCAAA ACASCNTTAT 660

TTGCTKSTTT CGGGMWCCGT GCCGCCNAAA YCCCAAASTA CTTTYTGGGT CCNAGAKAAA 720

ACCNCGGGCN CCMCCCSNAA NWTATYTCTT KGGCAANCCC CSAAACCTTR TCMNACCNCK 780

ATRMTCCCTT CCCCVSCAAT TGGYCGGRAT NCGSNCCYTY TCAAAKKKSC CAKWWNNGNG 840

GRRNNACCMA ACCCCAAGTY CCMNAAAATN GKCCCCGCTC CNAACACGNK TYYTCCSAAA 900

ASCCCWCCCC CCCCCCCRAA AACCCCCCNA RKANTNCCCA AAAACNYNGK GGCCCCCCCC 960

CAAACMAAAA AMCCCCCSGM RMACSGGGGN NMCCCCGKKK KKTTTTCTTT TKCCMRSCCC 1020

AAMGCAMWSY KSKTNMAAAA GGAAGRANCN TYCC3ANANM TCCCNYWRSW CCGSWGMGNA 1080

GAASMCCCCC CS 1092

(2) INFORMATION FOR SEQ ID NO: 324:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1251 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 324: 

GGGGGGGNNN NATACATCWT CYGTGYACCG GGGMTCTAKT GGCGGGCCGC AATCTNGTCA 60 

ASAGATCTCT NAMTTCGGGC ACAAAAACTW GACAAASYMT CGNGCNMTCC GTGTCCTNKA 120 

TCGCAAAACG NGTRACASAC ASACACRTAT GTGTGCCCAC CASCAAYTCK TTGGGACCTC 180

GCTRACCGGY TGCCCRNACG CCACGYTGCS CWTCTATCCC RACGCCGGCC ACGGGYGGGG 240
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ATATTCCAGG CACCACGCCC AGTTTGGTGG ACAATGCCCT GG CAKTTTCC TCRAANTTCG 300 

TGAAACCGAA TTCNSMTTGA ACCNCCAARG CCCCSNCCNR AACARTTGGG WTCCGCGGTT 360 

CTCCCCACCG KTTTCCGGGG GTNTCGGCAN AANCGCACCC WTGGWTTCTM TCNCCGCACC 420 

GGGCGGACAA NTCGGGTTGC AATTTTGCRA AYCGGGGCCG GGATTCCSCA AACGGGTGCC 480 

GAAACTGTTY YCRAAMACCG GGAKCCGCAA TTTCCGGGCR ANAAATTTCN YCNCACCACT 540 

GCTTRTACTT CCCCGACCGT AACMANTTTC ATCGTCNTNN CCTCTGCCCT TGGGGCAGGG 600 

CKAAAYACCG CMTTKGGTTT CGCAACCTGC GGCCCAANTC CCNAMCCRCA CTTTCNATTT 660 

GGNTCGAATT SCCCCCCGGT RANAACCSCC NTGGCCNNYT CGGASSAAAA NGGGCCCTNT 720 

KGGCNSCCCC AGTAANACCC TACCNNAYTS CAWTCTTTGC CAAASTTKGG ACGAANSKTG 780 

GGNTTCCGGK ATTTYYTTGS GGNCNCCCTN TATNGGSNTN GGGCCKCYNC NCSTKTGKCA 840 

NASSKAYCCS NGNKGGGGGT ACCCCCCTMG GGGGGTTTTT NSSGCCCCCC AWAYGNKSTG 900 

GCCCCCNNGG GGAAKAATWT MWWTMCNSGG GGGAAWTTTT NTSTGGAMCS SGGACYCCCR 960

GGGGGKTTTT TCCCCCNCSA NNAWANGGGG GGGGGANAYT NTGNSGNGGG KWNTTTATTT 1020 

YTYYCYCCTM TKACMSGGGG GTTTKKAKNG GGGGGAGAAA ANAAAAAAAA RAKGGYXNTT 1080 

TSKNCACNCT GKWNWNWANR NAGAGKTCCT CKCKCCNCSG SNTTTCTTTT MGNSGSYGGG 1140 

GNNGNNNAAA ACTXSRMMAC KCSYTYCCCG CGYCTCCTCC NCNGGGGYGS NGSCGNSTYN 1200 

3NNKGRKWTA TNTMGNCGTN SCCTCCNCCC GCKNKNTGTC TMTCNMYGSG C 1251 

(2) INFORMATION FOR SEQ ID NO: 325: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1099 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 325: 



AAYTCGGCAC MGAGTATCAC CAAKCTGYGT GGCCCAGCAA AGTGGAGCTA TTACTACCTG 60
TATGTGATCC TCRACATCTY CTCCCGCTAC KTGGTCGGGT GGATGGTGGC CTCGCKTGAK 120
TCRAAGGTCT TGGCCRAACG GCTGATCGCG CAAACCCTTG CGCCCAGCAC ATCAKCGCCG 180
AACAGCTGAC CTGCMCGCCG ACCGGGGGYC GNCAATAACT CCAAACCGGT GGCMCTGCTG 240
CTGGCCNACY CCGTGTCCCA ANTCGAACTC ASCCSGCNMA CCAXMAACKA NAACCGTTGT 300
CTGAAGCCCA GTTCAAAAAC CTCAAGTWCC GGCCCRACTT CCCGAAACGG TNCGAGTCKA 360
TCRSAGGSGG CCGGGTGCMC TGCAACCGGT TCTTCGGNTG GTRCAMCCCN AAAMCAAGCA 420
TTCCGGGMTC CGMMTGCCCA CGCCGCCAAS TTTMCTACGG GCSGSCCNAT CAAATTCGCC 480
3GGAACSGSN CCMCCXTCNK GGAMACGCCC TWCCAAAACC CYCGAACGGK ATCCTTCKGY 540
NAACNCCCGA ROTCCCXSKT TCCGGGCTTC NMSGCGAATA CCCKNSCMNT CCGAATCCAA 600
- i. CCCMKYGG CTTTTYYYCC CCCCGGCCCC AAAYNGGGYC CCTASSNMKC KNCCAMNANT 660
CCNWATCTGG NGGTCCCNAN KYYGGCGTTC NMAATSAMNA NMNRGGGTYT TSCYACCMMN 720
AACCGKNNKG KCCCCMKCTK MANAAAKATT RATCAMKWNG GGNKCKCNCN NAAMACCSCN 780
CNCYNOTYTC TMYCSSKWGC GCSMYNANCA SNGGGGAGGW GGSGRMKMCT CTMTCTCNCT 840
MGCGCCKNTN TYCKSGAKAT ACASMNKTCC GCGCNGCGCN MAAMANRAKA CTAKCCGYGN 900
CCSNSTMTYN CTSNNMKMNN TCCWMWNATC NTYYGK3CCNN KCTMKATNWC CSCTSKCNCK 960
MRAMTCKTYG SNMTCCTCCA TCNCTCKKSC SNMSKNTCKC KSCNCCNCWN CNKCNMKCWN 1020
GGNSTCRCCY TCTMNNNTCS AGCKCGSKNC WACNCACACK NGWCTYTTCC WKNNMKCNKM 1080
TCKCKCACRG MTMTCWCCS 1099

(2) INFORMATION FOR SEQ ID NO: 326:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 296 base pairs

(B) TYPE: nucleic acid
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(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 326: 

GNGNTATACA TCWCTGTGYA CCSAGGATCW ANTGCGGCCG MAAKCTWSTM CASAGATCTC 60 

AAAYTCTGCA MGAGCGGCAC AKAKYSTCGT CCMRACCCGG CAYACWCCWG CNCGCCCCWT 120 

CTTRGACCGG GGCKATASMC ACCGTTGGCC CCGGCNCGCA CCTACACCAC CCACGCCGCC 180 

AGCGCCCCCW TRAMCAAACC ACCCCGCKTT TACCGCCCGC GCCGCCGGGG CCACCACCAG 240 

CCCCACCGGC ACCACCGGCG CCGCCGTTGC CAAAACAGGC CCGCKTTTGC CACCRA 296 

(2) INFORMATION FOR SEQ ID NO: 327:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:327:

NGNGSGNKMY ATCATCWTTC TGCACCSNGG MTCWATTGCG GCCGCAATCT TSTMNASAGA 60 

TCTCGAAYTC GGCAMGARCA TCTGCGCGGN GAATGTCCAA AWGTCWKTAA CGGCMATCGG 120 

TTTGCCGYCA ACCACKCTRT SCAKATGCGG GCCAMWTYCA AACCRATTAT TTGGGYCGAG 180 

AAAATTTMCG CKTGTRASCA ACCTGCAGCG GGTCAASCAA CAGCCTCTRA ACCGTAAATY 240 

CXTAGGTNKT YCCGGCAACA ASCYCRATAA TSCGGCCCGC AMCCACAAAA CCTGANTNGT 300

TNTTCNCRAA NCCGGTYCCC GRAGGGGTSA ACTGCSGTAR GCTTNTCTYC NCCTTRACAT 360 

TAAACCCCCC CGGNTCWTCG CCGCGCCCAA ATYCYTGCCC WTKGCNACCA YCCCANCCTG 420 

CSGTATGGTS RAANCASTSG GCRAACGGTM MCCSTACCKC TGGCTGATYC XTCGGNTCCS 480

SNAATTCGGG GATTTACGGS CAMGGTTAAY CCAGGYCCCC TNTGCYTCKY CNACAACCSG 540 

ATCMWCNCCG TACCTKTTAA AATTCTTTGT GGTGGAACCC AWYCKAAAAA NMTNTYCCCN 600 

TCCAMMGGGG CYCGGAAKKT CNACNTGGKT NACCCCTNCC YTTGAASTTT TCYTGNCCCC 660 

GGCCCKAAAS ANACCSGAKC CCCGGAAYCS WTAGGCYTCN TGCCCCSTTA AATTKGNCYC 720 

AATCCKCCAA CGCTCCCCGG GGTCSSCCMT TAAAMTTCCC CCCKSCASNG GAATYCYKSG 780

GCWGTMATTW CCNCCCNTTT CYYGKNAAAC SCCCCCWKGN GSCTYCCCCN SNTTSSGCCS 840

GGTTSGAMYC AAAAWTNGGG MMCNRAGNCG SGNAMCCSCN GKKGGGSATW TKAAYYCYGG 900 

GGGGGTCNYC CCCCRCSNAA AAGYGTKGGC KCCSSSCCYC CCMARTTTYT CNGGMRCMAM 960 

ACCANGGGNG CTCCCGTNCW WGGCTCCCSN SNSMAMAAAN NKCKCCXGGS CKGARRNMNA 1020 

MCTCSNGNGG WTCCCKNKTC NSCNSGNCGS YGGNSASWCC YNYCNCCACA ANC 1073 

(2) INFORMATION FOR SEQ ID NO: 328:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1166 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:328: 
CGCCCCGTTC TTMMMTTCAY TCATTCACCG GGMTCTAGTG CGGCCGCAAK CTTGTCKACA 60

GATCTCGAAY TCGGCAMGAS ACAATSTCGG GTKGGGCAAT GTCNGGTGGG GCAACTTTGG 120 

GCTCGGRAAT YCGGGGTTAA CGCCGGGTCT RATGGGTSTG GGTAATATCG GGTTTGGTAA 180

TGCCGGCAGC TACAATTTCG GTTTGGCAAA ATATGGGTGT GGGCAATATN GGGTYCGCTA 240 

ACACCGSCAS TGGRAATTYC GGTATTSGGT NACCGGTRAY AAYCTGACCG GGTNCGGTGG 300 

TTYCAATACC GGTAACGGGA ATGTSGGTTS YYYACYCCGS GSAACGGNWW YTTNGKTCCT 360 

TMMCNCTSSM CCKSAAMTSM KMGGTSTYCT MTYCNNGGAS TAMTYNMCCC CCGWAYCKSC 420 

WAYCCCTCGT CATYCCMCMC SGSGYCCTCA MNCCACCYTG NGYYCCCTCC MKMTCYCAYT 480 

CMNTCCGGTW CCTNTMMNCC CSCNCRYCTC AMCNCTKSGK CACCNATMYC CSACKCHTCT 540 

MCYMCSCAKN MTTCCCCTCN CCTYTNNCCA MCMCSCTCTM TCMAACTCKC CCGGYCKCNC 600 

MYCTCTCKCC AYNMAACCKK TYCYWCNWYC YMYCKCKCAG WYKNMCTCCW ACTCTMYNTT 660 

TCTCTCNKCC CMKACCKNTT CTCWCSCCCC CCACAKAYMC YAWCMTMTCC MCTCKACSCC 720 

CYYCNNYCCM NMCWCMTCWC TWNAKCANCN TTCTTCTCTC MMYMTMACKC WCNNTCNCCK 780 

SGACCYTCTC ACTKMKCCKM TCTCCTTMCK CCYMWCNTCC MKYNCCCTCC NMTCMTCKYT 840 

CCTCNCNMRY CYYYAKCAKC NMCTCCCCAN KMCAKCTKCT CCCCCAKMKS ACNCXCCCWC 900 

CCTCCTATCC WCTCTCWCTY ATCTCXCTCW CNYCMYMKMC ACNCKCYAYT CNACTMNMWN 960 

CCANCNCTCT CTNYCTCWCX ACGTYCXCCX CTMCKCNYMC NRWCTYRCCT CKKCCNCCRN 1020 

CKNMCMKCTM CTCTCCWMKM TCCCWCCCAT CTMMKSTCTC WCNCMTCCCT CNKCCYNYNT 1080 

KCYTYCCMYG CTTCKNTCMT MCCWCCYATC TCTMKCCTCT CWCACYMCAC WKTTACWNCC 1140

ACTCTCTRCW CKCCKCMCCR MTCTCB 1166 

(2) INFORMATION FOR SEQ ID NO: 329: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1230 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 329:

NGNGGNNNNT CWTACATCWN TCTNCACCSG NGMTCWATTG CGCGCCGCAW NCTTGTMNAS 60 

AGAATCTCNN AAYTCGGCAC ANATGTCTTT TSTMTAKTGT GGCGGGGNGC CACGCCKTAT 120 

GTGYGCCTGG GYTRACCCAA CCCCGCGGCS CGGGCCRACC AGGCGGGGRA TSCAGGCCGC 180 

GGCGGCCGCG GCGGYTATAT RAAGCGCCGY TTTTKTRATA ACGGTSCCGC CGCCGGGTRA 240

TTACGGGCAA AAYCGGKKTT TTGGGTRTAT AACGCTAATT GCAACCAWTT TTTYCGGGTC 300 

AAAAACYCGG CGWGCANATC NCGGGYCNCT RAGGCGCATT YMCGCCAAAA WTNTGGGCGC 360 

AAAACCCCKT TSYTATTTTN TGGGCTATSC GGYTGCTTCG GCAAACGCTY CCCGGGTTAA 420 

TCCCKTCCGC GGCGCCGCCN AAAAACCACC AATYCCGYTG GGGGTGKYCC CMCAGGCSGT 480 

TGCTYCGNGY CACCTGGCCA AAYYCCCAWT AKATTGGGTG SCYCKTSCGG TTSYTGGGCY 540 

CAATTACCCC CNCGGGNAAA GRRAAAANAA ATCNTCCNTT TGCTCGGYCA YCTTTMTTGG 600 

SAAAAGGGGC ATGGCSCGGT TYYTTTACCT CAAYCCCCNA NCANTWACCT YTCCSCCCGG 660

GGGGNCANAA CGSTTNGCTC CGSGGNAKCC TKGTMCCCGN ATCNAAAGGC CNGAATTTGG 720 

TYYSSTYCNA ATTWTWKKKY CCCCWCNTTG YAAAAAKCCA AAASAKCCCK YCNCAMMYKT 780 

NGGGGTYSSG GCCKNYCTTK SNMTTAAACC CYCCCCAAAA YYNSGGGKKT TCCGCYNSAT 840

KCCACCNCCX GNGGGGGGNA SAAAAAAAAY TTTYCCSAAA ATCCCACCYY TCYKTKSTRY 900 

AMACCCCCTT TYYMKKAYTC CKYSCNATTC SGMTTCWAAA TYCCGYGGCT TNTTCCCCCK 960

CSGGNGCCCC AAWTTTGKTT YNCNANTTYC CCCNAAMNCM AWTMGGGGKS KCCATTCTGG 1020 

SCYTMAANTA AAANAANGGG NKTTTYYCTY MANAAACACN GTGKCNCNCN CNAAMAAASN 1080

AKMAAAKAGN KKKMTKNNSA AANCCNCCCC CTSTYTNYTT NKTNMNCKCC CYGGKKNKGM 1140

SWSWYNTTCT NCCCRCCCCC YNYNKTGANA AAMKNCYCCS GGSTMCRNAN ASNMNTTTCK 1200 

STSTNGMGCC KMBASNANAN MCAMWKWYCC 1230

(2) INFORMATION FOR SEQ ID NO:330:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1022 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 330: 



NGNGGGXNNA TMAYCWTCTC ACSSGGTCTA TGCGGCGCAW CTMGTMAASA GATCTCNAAY 60 

TCGGCAMNAN GCATMTCMMC CATATATAAC CATTGCGTCS GYWTGCAWCT CRAAWCTGTC 120 

CTTCSKGCCG TTKTACRAAG GTGGMWTGYT CWTYCCTRAA SCCCTCRATC TCKTKTATYC 180 

CTKGGGCTYC ACTTTAACSG RATKSCTGCC TTKTAYCATT RATGCAAWTA WTGGYCRAWT 240 

KTTGCAGGCC RACGGCWYCT TTTYCCGCRA GRACAATNGA TTGGAWYCGC TYCGCRAGGC 300

CCGGCACCAR ACCGGGCNCC AAAGGYCCGC GCAAWTSCCT GGKTCAAAAA TGGTGCAAAC 360 

AAAMCNATCC CCGGYTTRAC CGCAGYTAMC ACAAKAAAAT TCCCWTGGCC GCACCAWNNT 420 

TTYCRATCWY CWYCCCCACC TTRAACTTGK YTGCSGTATT GCCTKCCTGC CTCRACAGCM 480

YCNCCCKTCA AACCTGCGGT GACTCCAACT GGTCTGGYCG AASGGGGGYT CAMCGGACAA 540 

AACCCCRANN TCGCCAAATT TTCNCCCCCC CYCGGGAAAN GKTGATMTTC TCSNAACCSA 600 

CMGGGNNYTW NAACCCTGAA CSSSGSNKGA MYNSCCSGGA ANTTTTCCCT TYNGGGCGRN 660 

AAANCCTTTT AAGGTACCCC KGGNGGGGKG CCCYYTTGGG AAAACAACCC CXATTGGKTT 720 

TGGAAATNTT TKCNCCCCCA TTCNSGGGGG GGGCCCCAMC CCMMCTTTTN TCMSCNMTYY 780 

YCYYGGGAAT TNYTCGCCSG GAAYYCGGSM CCKGYCCTAA NCCCCMNWGG GKYSTGSNAR 840 

GGRATMAWWT TYSTTTYYMC CCGGCNNCCC CCCKAKMCNT KGNTGAACMA AAAKCSGGGG 900 

GSCNMYMWYY YCNNNGNRTT TNRGGSSNMT TVMAAAMMAN GGGGKYWTYY CKCCNGSCNN 960 

GKTYSGGGST TTTCCNTTTS GGGSSATYKG MACCCCXTMT AYCCGGGGGT NTKTKYCCCC 1020 
SC 1022



(2) INFORMATION FOR SEQ ID NO: 331: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1033 base pairs

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 331:



NNCGNNKNTA TAMAYCWYCT NCACCSGGGA TCWATTGCGG CCGCAATCTT STMAASAGAT 60

CTCKAAYTCG GCAMGANCCG CAWCTATTTG KGTGRASCGC ACCAGCGRGA CCTCGCSGKT 120 

CXTTYCTTGC AGRGAGGCCK TGGGTGGCRC CGGTGGCAAT GCCAACCGCC CCCCAAAACN 180 

CCGCAAATMY CRAAAAACAA CCCSGGGGTA GKTCCSGGCC GCCAAATMAA TAACCGTKTT 240 

AACKCAGGCN ACGGCCAACC GGYCCCGCCC AACCAAGCNA CCTCCCCSCC NATAGGYCC3 300

GTGGGGGCTG CCXTATYXCC AASTCGTCAY CTCNACGGGM CGGYCCMCWT TCCGCCTCAT 360 

CCGTCTCTCC TTMMATTTTC CRTCCACYKG GCGGGGAACY TTTTTNYCNC CCTTGSCMAN 420 

CACCNAAGGY CNAAAATTNC CCMTGCCKYG SNNCAAAYGR GATTGGGGTY CGKKTTTTNT 480

TCNMCCMAAC CCCCNTTTNA CGCCCCMATC CCYTWATACC CCCWWMCMNS ANGKTTGNSA 540

AAKTNNCCCC AAATRCCAAA MTTCTTCGCC NTTTMTWMCY YYCCTTTCCC CMCCCWNAAA 600

GGSCCRCCYY TCGGGAANTY TCCCCNCAAA AWTCAMWCCM TTTCCCNCCA AGAAWTTCSG 660

SACTCCTTTN TTCNGGGNAM ATANATYYTT YCKTNGGGSK TTCCGMTCNC AMMAATNTCC 720

RGGGKAAMCC AGKNTNNTCC YYYYCCCCAA NNTYCCYKGG RMCYNNYYCY TTAAANRASR 780 

SAACCCKSGG GKCYNCNCSS TARCCCCCAM KAAAATTTCC CCCSSKTTTC TYYNNKKMRW 840 

GCCCCCSAAM ACTMTWAYTT TCCCKCGNNN TTTSYCCKCS KCAMWMWMTG KKNCTTTTTT 900 

YCSCMATAMA CTTNGGKCCT NTCNYGSGCG CMAAANAAGG CGCGSTTCTN TTCWMAMACA 960 

YNTSGNMMMA SAAKAKWATA AWNNTRKKYK TKNNCCCNCC CKCKCTTSNN TNKCCMCSKS 1020 

GGGKNWNKXR GWCTCCWCNC CKCCCNCKNK CCKWATMCCC CCCCSKCCGM NCMMNTTTKT 1080 



(2) INFORMATION FOR SEQ ID NO: 332:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1069 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 332: 

GGGGNNKYAT MCAYCWTCTS YACSGGGMNC TATTGCGGCC GCAWYTNGTM GASAGATCTC 60 

GAAYTCGGCA MGAAAAAAGW GATGTGCTGG ACCTTMCCGC GCGGGACGCR ACCRACAAAG 120

RAASCGCGCC ANAATATTGG CCACAKTTGG TCACATATTT ACCCAATTMT AYCAGGGAYT 180 

MCCATTCCXG GGACCRACCG CACAATCCCR ATSKTGGTTT GCRAACCCTR ACCGTCCCCA 240 

MYTYCGCCRA STTGAACCAG GGCRAAAAAA CGGCCRAAWY CTCGCCCTGA NTCCCGCTCS 300 

GCGCNAATAA CTAGGCCCAT TKAACGGAAC CGGNGGCCSC NANTTGGCCA ACAGGTCCTR 360 

ACAAAGGGGC CCCASYYCGG CCGGWTCCCW TTYCACNCCC TNKTCTCKTG CCGAATYCGG 420 

WTCCRATNYC CCWTGGGCCT TKTCKYCKYC KYCGGTNCCA AWTCTNGGTA TNCTATRGKG 480 

TCCCCTAAAT SCANATCTGG GCXYCCATTT NCTGGSNTTC NATTTAMMAN SRRCGGTTCT 540

TTCWTTCCRA AACCSSNTGG 3CCCNNMCCA AAAAATGATN ATAATAATGK YGSCTTTCAA 600 

ACCCCGCCCC CCCATTCRWT CSGTTCCANC CCCCNGNGGT TAAGKTGGGA ATTTYTNAMC 660 

YCNARGCCCT NATTTSGGNA AAAACCYCYC GGGYCTCAAA CMNYTTTTTT GSKSSNTCGG 720 

GCTCRTTCSC CAAAACCCAA ATTNTYNYGG GGYCCXTNAA ACMCGGYCRC RCCGGAAATT 780

TTTYTGGTTC AACCCCAACC TTTTCAASCC NTTTTYTYYT TRCCSSCSMN TNGSSGGGNT 840

KSSCCNTTCY RARKKCCNMN GGGGGWYCYN CCCCRMNTTT CTTTTTTTTT CCGTNNMAAM 900 

NGXTTCTTCA AASMCCCCCC SCCCCCNSAA ACCCCCTNAR GTTTTYCMMA AANNWYNNGN 960

KNCCCCCCCC MMNAAAAAAY YCSCCCGNRN ACSMSNGGGA MCCCCCGGSN NTTRKTTTTT 1020

TNCMSGYCCC CSRMASYYTT 7KAMAMANRR GAMNSMTTTY TNNRGNWNK 1069

(2) INFORMATION FOR SEQ ID NO: 333:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1210 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 333:

NGNGGGGKWK MATACATCWT TCTTCACGSG GGATCWATTG CGGGCCGCAW TCTNGTMCAA 60

SAGATCTCGA TYTCGGGCAM NACCCACCWC TCCRAAAAAA ACCCRAAWCT CGGGSKCTYC 120

GARAAGTGTT GCCCGCKTTR AATTTAACAA ATTCAGTGTC ANAGTGTCAC GGCKTTACWT 180

YCCCGGCAAA GGGGCCACAA CCTGCAGRGA SCACYCRATG GKTGYTGKTS CNCGGGCGGG 240

CCGGKTNAAG GGACCTGCCT GGGTKTGCSC TMCAAANATC WYCCGCGGGT YCGCTGGRAT 300 

MCNCAGGGGT GTCAAAAAAC CGCAAACAGG CACSCCANCC NTTTACGGGS CTTAAAANGA 360 

AAAAGGGCTG ATGCCCCCAA GGGGGCCCGC NCCCAACCTT CCGTTGGTCA ACAACCCGGT 420 

CTCTCKTGCC RAATCCGRWT CCRATNYCNC CWTGGCCTTK TCKYCTYCTY CGGTACCCAA 480 

ATCTGGGTAT CCTATASTGT CCCCTAAWTT CCAAATCTGG GCTGTCCATT TSCTTGGCNT 540 

TCCAAATTTA CCANCAACGG TTTCTTNCAT NCCAAAAACC GNTKGGCKCC NRACCCRAAA 600 

AAATGAATAA TAATAANNGG KCNNTTYCNA ACCNCCCCCC CCCNATTCCA TYSNGTTCCA 660 

NMNCCCCCAG NGGKTAGGTK GGGAAANYYC TCMACCYYCA ANCCCTWARS TTTTNGRAAT 720 

KAAACCCTYC YCNGGGTCWW TYMAAAAAMA NTTATTTGGN NGNTTTCGGG MWNCKRKNST 780 

SCCAAAATCC MAAATANTTT YYTGGTYCNA TWAAAAAMCG YGNCCMNCCC GGAAAAWTTT 840 

TTNTGKTTSA ACCCCAAAAC YTTTTCMNAA NCSSKTTTTY CYTTCCCCCC AMNWTGGGYS 900 

GGGNATKGYG SCYTNTCTTA TKTKYTYMTW CMGGGGGGNN MKKTCMMCCC CCMTTTYYCY 960

NYWRTTTTTN KCCCCKTNMR NNRAANNGGN YTCSYNANAA AAGCNCCCCC SCCXNCCCNA 1020 

AAAAWCCCCN NNNARAKTNT TTMKANNRMN SCKCNKNGKY YCCCCCCCWC YNMNNAAAAA 1080 

AATMYCCNCC RASANMCASM NMGGRGNRSC CCCCCCCSTT NNNNTMTTNT TTTTTTCSRA 1140 

GAGCKCCSCG MNNANMKNCX CTTTTTKCNC NNGNNGNGNN GGNGMNCKCC CCNAGAAMWK 1200

CTKSTCCCKS 1210



(2) INFORMATION FOR SEQ ID NO: 334:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1105 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:334: 

NGSSSNGNNA TMCATCWYCT GYACSGGGMT CWATTGCGGC CGCAACTNGT MAASAGATCT 60

CGAAYTCGGC AAKANACACC ACCGCCGTGT MTATACACCG CAAATGTTCT GTKTGCCAAA 120 

ACCGAGACGC GCCGGCCGCG GGGYTCCAAC GCXTTACYTR ACCCGCCAGY TCAGTGTTRA 180 

AACCGGTGYT RAGGGCCGCA CCCAACWTAA ACGCTTTAKC CAAGRAWYTG GKTGGCCCGC 240

AGCCACCTGY TGTGGYTGCC CTCWYCGGTG GTAGCGCCGG TTANCGCCGG TTGCGCGYTC 300 

AMCASCSCGC CGGTRATCCC AKCNWTCCCC CGGCCMRACC CACCGGGCAC TTTGRACGGT 360 

GCCGCCAATT CAAAYCKYCT GRWTCCTTCM AAACACCACR AAGGCCACCM CCMSCACCNA 420

ATMGGGRACT TTAAGGCCCA GGCAAAACCT NTRAKCNCCT CCCGGGCRAA GGTCCSGCAA 480 

SCRATCCMAA AAAAKCKNAT TTCCCCCAGC AKCAACCCAA MMCGSTTTGC TGCTTCCGGA 540

TTCGAAMCCA ATTMCWGGKT NCNWGGGAAA AACASCNNCC NWTAKCCMGG CCCMCGGGCA 600 

ATTTCSGRAA SAACCCCTNY CCCGGGTTTT YCCTGCTCMG GCCCAANACC CCCGGGAATC 660 

AAAAASGGTC GGNCAAANGG GCMAAACCCS SACCCMACTT WTTCCRCTTN GGGGGGSCWN 720 

CCXNGTTTAA AWKSCCTCYY CTSCCCAAAY TCGGKCMAAA NNGRKTTGGK TTNGGCNACC 780 

NTTTCCGGKC CCGGGKGKGK WGKYCTMNMA CSTTTNTTTT SCCCCYKAAA NYSCCCCCCC 840 

CGGSSCCCCG CCCGGGGGGA NNTTTTTAMA GKKTYCCCCT CCCCAMAAAA ANACCCCNYC 900 

CCSGGSCCCT TTKRWAAAMN KCTSCCCCNG GNNGGGGKCM GGKTTATTMT NNNCCSCCCC 960 

TCCGCGSAAA AAATAKMTTT SYCCCCCCNC CTCCKNCKNR GKAMSMSCGC TCCCYCTCNC 1020 

GCNKNTWAAN ARSNCCKKNN CCNCYKCCGS NSNGKCNWCD NCCSTSSNCT NKGCNCKNCN 1080

KAAANAAYNC NGSMSTSSMN CNKCC 1105



(2) INFORMATION FOR SEQ ID NO: 335: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 936 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 335:

NGSNSNKNNN TAMAYCWYYC TSCACSNGGA ACWANTGCGG CCRMAWCTNS TMKASAGATC 60 

TMGAAYTCGG CAAGAGCGGC AAGAGTGTGT GCATCTGGTC ANAGTSTMMA CRCGGTGCCG 120 

CSGGTGKGTR GASCACMCAT NTGCGRACAC CAAACCCKTC GCGGGYCACC GGCKTCGCCT 180 

GCAAAWYCCT CCAGGCCACC TCRAACAAYW YCTYCTGCAA CGCARGCCGT TYCGCGGCCG 240 

RATCCTGGKT CASYYCGCCK TGCGGTGCCC AAGKTACTGG CSCAYCAAAA CCGCTCCGGG 300 

RAACRAAOTT AAWTYTGCCG AATTTCNTTC CCCTGCGCCT TGATAAATTT NTNAAGCCAC 360 

CGCAAMCCTY CGGGCXTCTC CTCKTGCCRA ATYCGRWTCC RATAYCGCCA TGGCCTNKTC 420 

KYCTYCKYCS GTACCCAAAT CTTGGGTATC CTATANTKYC CCWAAANRCA AWTCTGGGCK 480 

KTCCATKTSC TGGSKTCCRA ATTTAMMACA NCGGTTTCTT TCWTACCAAA AACCSNTGGG 540

CCCCRACCRA AAAAKGATAA TAATAAKGTG CMWWCAAAAC CCCGCCCCCC RRTTCAAYC3 600

GTCCARCACC CCANGNGGTN AGGTNGGAAT TYTMAACCCC CAGCCCATAA SNTTNSGNAA 660 

AAACCCCCCN GGGYMYCAAA AMMCTTTTTG GGGMTTCSGS CCATKGYKCC AAAACCAAAA 720 

TMTTTCYGGT CRWAAAAACC GGCCCNCCCG NAAATTTTTT GKCAACCCCA AACCTTTMAM 780 

CCNNNTTCYY YCCCNSACAA TNGGSGGNKN NGSSCNTTYT TWTTTYYNNA GGGGGGRRWC 840 

SNCCCCNAAN YYCCNAANKG NKCCCGSNMA AAAGAGANTT YCMKAAAAAC CCCCNCNCCC 900 

NAAAYACCCC MAAAKWTTCM AAASMSCNNG YCCCCC 936 

(2) INFORMATION FOR SEQ ID NO: 336: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1042 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 336:



NNNGNKNNNY ATMMAYTCWY YCTSCACCSG 3GNNWCWATT GCGGCCRMAW KCTTGTMAAS 60
AGATCTMNAA YTCGGCACAG ASSSGCACAG ASCCGCGGCG CTATYCMYCC GYTGCTCATG 120
CTCAACACGC TCKTCGGCGW GRATAATGGC NCGCCGCCGG CGCCAACACG YTCAAYTGCT 180

TCGCCAACGC CATATNTCAA CAAGGTRATA AAASCAAAAC CGCSCGCCGY GCCCTTGGGC 240 

SCGGRAASCG GTGCCAACCC RAAACNCKTT GGGCACYCGG KTSRACTTTA AASGGTAATC 300 
TCKTCCTCCT GGGCTATGGT GCGCCACAAA CCTSYTGGCG WGGGTCTGGC CCTGGGYCAC 360
CGYCRCNTTT TATNTNTCCK YCTACACNCT TKGGTYCAAC CAACCCACTT CACMAAATTG 420
TTTTGGGKTG GGGSSGCCGG YTGTNNCCGK TAATAATCSG NTGKTCSGCC MYCACCGGWA 480

CCATANCCTG GCCGGCSCTG GCAAATTTCC SAAATCATYT CCTTCTGRAC CCCCACAMRC 540 

CTNSAAATCC GRATCAATNC CCCNKGGCTT NTCYCTCTCN GTRCCCAATY TGGTTTCTAT 600 

RKTNCCCYAA TSCAATTGGS TTYCCRTTSC YGSTTCCAAN TTNACAAMAS GGTTTYTCMT 660 

ACCAAAACCC NTGGSCCNNA CMNAAAAKNA RAAAANAKGG KCTTTYAAAC CCCCCCCTAT 720

TCAWYCGGTN CMRNWCCCCG NGKAAGGKGN GAAAYTTHRA CCCAANCCMT ARSTTSGNAK 780 

AAACCCYYCG GGGTSMCAAA MKNTWTTSSC CTTCGGMCTT YCCAAATMSA AAATYYTCKK 840 

KRMNAAAAMC YGNCCCCSAA ANATTTTTGT NAAMCCCKMA YYTRTTWMCC WTTTTCCYCC 900 

CCMCNNSNSG GNTNCCCTTY TYATTTCYMM MCRNNSGACN CCCCMNTYTT TWTTCKCWCN 960 

MMARGSNNYT RGRMMNMNCC CCNCCCCNAK MTCCNCAAAK NTTTNAACNN NNKYCXCCCC 1020 
CCCMWMNKNC CCCCMNCMTT TM 1042

(2) INFORMATION FOR SEQ ID NO: 337:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:337: 

NNSGSGMKKK ATAMATCWCT CTSYACCSNG GMTCWATTGC GGCCGMAWTC TNGTMAASAG 60 

ATCTCGAAYT CGGCAAANAK ACGCMAYGTC AAGTGTRAYY CGGTCACATA TCMTCGCGNG 120 

TCAACMCCAA AGCCGNGTCA CCGYCTCCCT GGGGCGCCAC CCCCATCGGT RATGCAACYT 180 

CGCGCGCCAC CGYCAAAAGG KTCWTTRAGG CGCTAAAGGT CAMCAATTCC TRAGGTYMCN 240 

CACCGTTNTT TGGCCC3CCC RAWTYCTRAC CCGCAATWTC GGTAATCGGR AATTTGGGCW 300

YCGGCTTGGG CAATAAGKTN TTGGGCAACG GCGGRWTCYC NCTGGCCGRA ATTCCCNCAT 360 

TCCKTTAACG GKTGRACCGT TTYCCCGGYT GCCGTAAYTG YTYCNTGGGC GCCYTCGGCC 420 

CRNAGCASYY CRCTAACGGY CMCCAGGCAA TACCKTTGGC TTTRAACCAC CGGRATNAAY 480 

TGKTACCCAC YTCAASSGTS CTGRANTTRK TNTCNTGRAA AANMCCACCN AACCCGGNTT 540

RATCTGCTTC MTCANCWTTT SCCGGGTTCT GCCGTTTTGR AAYCTTNATC CMTYCAAAAG 600 

GTTTAMTTTC CCAANRAATT CGGYTTGCCA CCTTGGCCGS GGCTGGTTTM CGMWCCTTRR 660

AMATCCNCCS GCGGGSAAAN AMTTSGGNTT SGSCCGGTCC CCCGNAATAT YCNTGGNCCT 720 

GNAAATTGSS GGGATCCCCN GSGNAYCCGG CCWTKGGGGK TNCCCAGTTG GWACAATTYC 780

WKCCGTTCCA AACCCGGGNC CGGGGGGTGG GSCCCNTTTT CCTMYNNAAA AAGKGTTTGN 840 

NYYTTTTCCG CNRAANTTCA CCSKCNKTNT GGNCCNAACY YYYCAANTTC CANACCTTTA 900 

AASAAANCYK YGKTYYCCCC TTTTMCCSGS SANCCCCCCM NMSSKNCGGG AAAAAAAGNK 960 

TYNGCCTTAN CNSNKTKTTT TNKTYCCCCC NMWNNSNMCY NC3KKCNKRY NGNSNMNCCT 1020 

MKYSXCNNNN SNNNNNKCGN GSNCSGMKYM CMNNCNGMYK NGNKSNNCCC MSC 1073 

(2) INFORMATION FOR SEQ ID NO: 338:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1061 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:338:

GNSNGNKNTN TMCAYCWYCT SCACSGGGTC TATTGCGGCC GCAATYTNGT CKASAGATCT 60

CGATYTCGGC AMNANAARTG TCGTCGTCAA TTTCAGKKTG GTCKTCAAAY GGGCCAGGCC 120

GNGACCRACA CCCTGNGTCA CCCAAAANAC CAACAGCWTC AAATWTCAAG GCCRAGGCSC 180 

TRTCAATYCC CRASCAKTTA ACCGTKTCCW TCRAAGGTGC CRAACCAGGC ACCCAGYTCA 240

CCGCCSGGCA AWTCGCGCTG CCGGCCGGTN TCAGCCTGAT TYCTGACCCT RWTCTGTSGG 300

TGGYCAMCNT GGTGAAGGCC CWWCCGCCNA AGAACTGGAG GGCRAATTCC CAGGANCCNA 360

GRAACCCNAG GAACCCGCGG TAKAANCCGG CRAAACCRAG GCCGYTGGCN ATTCCNATTA 420 

NAMSGGTTTG CRACNTGGCC RAACCGTTTY CTTGGTCGGC CTCGGCAACC CTGGACCANT 480 

TACCCCKTNC CCGGNMCMAC CYCGGGTNCT TGKYCCCAAT NTGCYCCCGC GNRANTNGGC 540

CNAATTCCAG GGCNCCANCT TTCCGGCCCN AATTCCCYTG GTTAATCACC GGGCNCNCC rn 600 

GGTTTTGGGC AACCCCNCYS CTTMTTTAAA CATTCCGSCC CAAATGGGNC STTGGSAAAT 660 
TCTNTYCGGT GGGGCSGGCR ANMYTTCTCT YCCCNAASAN CTTAMYCCAN TTCGSSNTCC 720

CGGKCAAAWS NGGGGGGGNA AAGGGCCCCC CGGNTSCKCC GGGGKKGCCC CYGGXTTCAA 780 

AANTTTCSGG GKTSTMSCGG NVTCSCCCCC CSGCCAAGRA CCGNGGTTTT TTTTTGAACC 840 

KCMANTCSSA AMCCGCCSSC CCCMAAAGGS GCCTNAAWGR RAYTTNKSCC CNNAAACSGG 900 

CCCCCAKYTY SGGKTTCNNC CNCCSGKKGT CCMTSTTTMM MRCCCTTTGN GNKTTTTTAN 960 

MGSCCTTNNC CACCCCCTCX GGGXCSMNNA GAAKTMYWKC CNGGGGNNAN RSCCCCCCNN 1020 

GSGKGGGGKG MGAGYSCCKT CTKGCGNCNN YKNTTTCCCC C 1061 



(2) INFORMATION FOR SEQ ID NO: 339: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 986 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:339: 



GNNGNNNKWN ATMCAYCWYY CTSCACCSGG GMTCWATTGC GGCCGCAWKY TNGTMAASAG 60 

ATCTMGAAYT CGGCACANAG CGGCACAGAG TGTGTGCATC TGTGTCANAG CTGTCAACGC 120 

GGTGCCGCSG GTGGTRASCA CMCATTGCGR AACACCAAAC CCGTCCGCGG GYCACCGGCX 180 

TCGCCTGCAA AAYCCTCCAG GCCACCYCRA AACAAYWYCT CCTGCAACSC ARSCCGTTYC 240

GCGGCCGRAT CCTGGXYCAS YTCGCCXTGC GGTGCGCCAA GGTACTGGCS CWYCRANACC 300

GCTYCGGGRA ACCNAACGTA AATCTTGCCN AATTTGCNTT CCCCCTSCCC TTRATNAATT 360 

TGTTAAACCA CGCAAACCTY CGGGCKTCTC CTCKTGCCRA WTCCGRWTCC RATNYCGCCA 420

TGGCCTNKTC KYCTYCXYCS GTMCCCAAAT CTTGGTATCC TATATTGTCC CTAAATGCAA 480 

ATCTKGGCTG TCCATNTGCT GGCGTTCAAA TTWAMANCAG NGGTTTCTTY CTTCCNAAAC 540 

CCSTTGGCCC CAAACCNAAA AATGATNATA ATAATGGTGC TNTCAAACCC CGCNCCCATY 600 

CNATCSGKCC AMMCCCCRGN GGKTANKKGG GNAATTCTMM AACCCCAAGC CATAASNTTG 660

3GANAAACCY NCNCMGGYCA CCAAAACANY NTTNTTGGNY SSNTTCGGMN YCATGGCTNN 720

CMAAAACCCA AATACTNYYG GGYCCAATAA AAMMMSGGYC SAMCCGGAAA WTTTTYTTGN 780 

XYNAAACCNA AAKCCTTTTT CNAACCCDAN WNTYCCTNCC RCRCMANTGG CNSGGARTXT 840

SSSCTTNCCA ATGKYCCMAA AGNGGGRANA CCARCCCCAA TTCCTNNNTN KNKNCCCNST 900 

TRNAAAAGGG 3XNTYNCMAA AASCNCCNCC MCNCTCCCAA AAKAMCCCCN AAAGAKNTCN 960 

NAANASXYSN NNNSCCCCCC CCMMMN 986 



(2) INFORMATION FOR SEQ ID NO: 340:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1074 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: Genomic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:340:

NGNGGGNKRN ATMMAYCWCT SATYYACCSN GGMNMWATTG CGGCCRMAWT CTNGTMKASA 60 

GATCTMGAAA YTCGGCAAAG AGYATXCTCG GGGGCCAGAT TTNTGGCCCG CAACCGCCGC 120

ACTTTGCAYW TCAACAKTCC SGGTGCCCCA AAAAAWTCWT ACCCCCATMC TYCKTGCASM 180

ASYTGCGCCC RATTRAACAC CCGGCCGGCW TGCTGCGCCA GGTATTYCAS CAGYTCAAAY 240 

YCTTTXTAGX TAAAATCCAG CSGGC3GCCA CNCAGCCGGG CGGTXTAGGT GCCTYCRTCA 300 






ATMACCAGCY CGCCCAGGGY CACCTTGCCC AAAAYCTCCT GGGTCAGCCA AATTYCCGCS 360 

CCGGCCAACM ACCANCCGCA TYCTGGCNTC AATCYCACCG GGCCCGGTGY TAAAMMANMA 420

GRATCTCKTC MANCCCCCAN TCAGCSYTNA CNGCMACAGC CCGCCTTCTT CAMACCGCCA 480

RTACCGGGWT CAACCGGCCS GTCAAACTCA ACAGGCGGNC AGGCCTCCCC CGGANSAAAG 540 

GTCTTACSCC NNYAANAAAA MAAGNTCTGT TTTCCCCCTC CASAASNAAA AANCCCCSGC 600 

CGGGCCTTCN NMMGGGTTTG GGGMANANAA AARCNCCGGN GGAACGNATC CGAAAMCTCC 660 

CAAGTCNCMT TWAWAACYCN NNAACCCCCC ANTTTTGGGA AAGGNTCCCC NTTMYCCCCC 720 

TTTTASGKTS GGGMMYYCTY TAAAAAAATT CCCCAAAAAG CCCCGGGAAG GGTCMAMCTG 780 

GGNAAATTTC CAAMCCNWGK TTNTTYNGGT TMCGGGGGRA AATTYCNCTC CCYYNNNGGG 840 

CSSGSNNNAT TAYGGMSNMT TTTNNAAWTM NSGKKTSAMM YNNXCCMNNN SNNMSMANNK 900 

TNAMCKCCCN CCTCNGNGKY CSCYNCCCSG GHAGNGGRAS MKCCNANMAA AYASGNTTNK 960 

CGGAAMMCNN AATKGNNNSC CCGGASMCMN NNNMAAATMT CNCNKCNSNN AANRGMRACN 1020 

CCCNSNSGMN RRGAARMTNY YCCCCCGSKM GKGNKAAAAW GKYCCCCCCM AAAG 1074

(2) INFORMATION FOR SEQ ID NO: 341: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1195 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 341:

NGNGNCNKNT MTACATCWTT CTGCACCSGG GNTCWANTGC GGCCGCAWKY TTGTCGASAG 60 

ATCTCGAAYT CGGCAMGAGG ACWCTCGCRA CGCCCCCACA NACTCTGGCG TGTGTACCCC 120 

ATTGNGCGCK TCACGCGCCC AYTGANCCAK TNCACTGGGG TGCCGTYCGC CKTGCGCGGC 180 

GGCCTCACGG CKCTSCMTCT RAAGGCWTGG CGCACCGCAT TCGGTTTTCT RAACGCGGG 240 

AAAWTGGCCA GCCGTCTGGC TCATGGGNTC TACGCAACGC CNGCCCCCAA CRCTTTCTTA 300 

AATCCGGYCC NTCCTGANCS CTTTGAAYCC CGGGGSAAGA ACTGGTTGCS CNCGAYCTGC 360 

TCGAACTTRK TCNAAATCCC GCANAKTGTT 7CNTAMGYCC CNCCGGAAGG NGAACCACT 420 

TTCNGGWANG TCGGCNKCCG GCGCTTATCA STCCTGATCA ACGGGGAACT GGYKNNSTTG 480 

KGGGAAAAAG RRCCTCAATG MTYGGTCCKC GCTGCGKANC CGCSCCCTGK GYCGCNAATG 540 

GAAGGCSMAG GGTTAANGCC MTTYCNYCCR RSCCGTSTGA SGKWTTYCGG MGGANKAMNN 600 

NNKMAMWTTK TCRGNGGCCW ATSTSCCGGG CXSTTAKAGA ANACTYCCXW WCCGTUT'SC 660 

3AAAGNTKCS GCGMGTTTTS SCCKMGANGN YCTGATTTSA GGGGGKYKCC CCCGGGGTYC 720 

CGAAWKWRKY CCYAGGGGGM GNYCSAGCSC CGMNNATNAG AGNAAGGKTT RYGSTSKNCC 780 

TYTNKGGACC WSCNNCWSAK ANAACNNKKT TGCSCCNTMS AGNKTNKGRT YCCNKTSTTC 840 

TAAGAGGAGC TATKMKCGCC CXTGGANGMM GAGWGMGCGC KYCCCSNKRT TCNTNGWAAA 900 

TATKSAGMGG TXCCGMAGMK CCSCGTTTKT TKTGANAAMN MSMRKNXKTG CGMGYTCTSC 960 

GGGNTTTGTA GAGTAKTCGS CSCSSMWGAC WCSGMCMGNG AGKNKTNNTS YANTGARCGY 1020 

MNNSKTMKMT MSCSCGCGNA GGAGNGCCCC CSANGMSTGY NKGGNMSSNG ARAKGATGGS 1080

GGCCNCGMNN MGMGGANMGA SANNGMGGMR GGGGGKTGKC TCKCSCCGNS CSANGRAGAA 1140 

GKTCNGSCGC CGMGGKYGKT KTKTKNKTGG YSTCMSSMMM NAGAAAAGAG AGGGC 1195 

(2) INFORMATION FOR SEQ ID NO: 342:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3572 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:342: 

CCATCTGATC GTTGGCAACC AGCATCGCAG TGGGAACGAT GCCCTCATTC AGCATT""GCA 60 

TGGTTTGTTG AAAACCGGAC ATGGCACTCC AGTCGCCTTC CCGTTCCGCT ATCGGC"GAA 120

TTTGATTGCG AGTGAGATAT TTATGCCAGC CAGCCAGACG CAGACGCGCC GAGACAGAAC 180 

TTAATGGGCC CGCTAACAGC GCGATTTGCT GGTGACCCAA TGCGACCAGA TGCTCCACGC 240

CCAGTCGCGT ACCGTCTTCA TGGGAGAAAA TAATACTGTT GATGGGTGTC TGGTCAGAGA 300 

CATCAAGAAA TAACGCCGGA ACATTAGTGC AGGCAGCTTC CACAGCAATG GCATCCTGGT 360 

CATCCAGCGG ATAGTTAATG ATCAGCCCAC TGACGCGTTG CGCGAGAAGA TTGTGCACCG 420 

CCGCTTTACA GGCTTCGACG CCGCTTCGTT CTACCATCGA CACCACCACG CTGGCACCCA 480 

GTTGATCGGC GCGAGATTTA ATCGCCGCGA CAATTTGCGA CGGCGCGTGC AGGGCCAGAC 540

TGGAGGTGGC AACGCCAATC AGCAACGACT GTTTGCCCGC CAGTTGTTGT GCCACGCGGT 600

TGGGAATGTA ATTCAGCTCC GCCATCGCCG CTTCCACTTT TTCCCGCGTT TTCGCAGAAA 660 

CGTGGCTGGC CTGGTTCACC ACGCGGGAAA CGGTCTGATA AGAGACACCG GCATACTCTG 720 

CGACATCGTA TAACGTTACT GGTTTCACAT TCACCACCCT GAATTGACTC TCTTCCGGGC 780

GCTATCATGC CATACCGCGA AAGGTTTTGC GCCATTCGAT GGTGTCCGGG ATCTCGACGC 840

TCTCCCTTAT GCGACTCCTG CATTAGGAAG CAGCCCAGTA GTAGGTTGAG GCCGTTGAGC 900 

ACCGCGGCCG CAAGGAATGG TGCATGCAAG GAGATGGCGC CCAACAGTCC CCCGGCCACG 960 

GGGCCTGCCA CCATACCCAC GCCGAAACAA GCGCTCATGA GCCCGAAGTG GCGAGCCCGA 1020

TCTTCCCCAT CGGTGATGTC GGCGATATAG GCGCCAGCAA CCGCACCTGT GGCGCCGGTG 1080 

ATGCCGGCCA CGATGCGTCC GGCGTAGAGG ATCGAGATCT CGATCCCGCG AAATTAATAC 1140 

GACTCACTAT AGGGGAATTG TGAGCGGATA ACAATTCCCC TCTAGAAATA ATTTTGTTTA 1200 

ACTTTAAGAA GGAGATATAC ATATGGGCCA TCATCATCAT CATCACGTGA TCGACATCAT 1260 

CGGGACCAGC CCCACATCCT GGGAACAGGC GGCGGCGGAG GCGGTCCAGC GGGCGCGGGA 1320 

TAGCGTCGAT GACATCCGCG TCGCTCGGGT CATTGAGCAG GACATGGCCG TGGACAGCGC 1380

CGGCAAGATC ACCTACCGCA TCAAGCTCGA AGTGTCGTTC AAGATGAGGC CGGCGCAACC 1440

GAGGGGCTCG AAACCACCGA GCGGTTCGCC TGAAACGGGC GCCGGCGCCG GTACTGTCGC 1500 

GACTACCCCC GCGTCGTCGC CGGTGACGTT GGCGGAGACC GGTAGCACGC TGCTCTACCC 1560 

GCTGTTCAAC CTGTGGGGTC CGGCCTTTCA CGAGAGGTAT CCGAACGTCA CGATCACCGC 1620

TCAGGGCACC GGTTCTGGTG CCGGGATCGC GCAGGCCGCC GCCGGGACGG TCAACATTGG 1680

GGCCTCCGAC GCCTATCTGT CGGAAGGTGA TATGGCCGCG CACAAGGGGC TGATGAACAT 1740

CGCGCTAGCC ATCTCCGCTC AGCAGGTCAA CTACAACCTG CCCGGAGTGA GCGAGCACCT 1800

CAAGCTGAAC GGAAAAGTCC TGGCGGCCAT GTACCAGGGC ACCATCAAAA CCTGGGACGA 1860

CCCGCAGATC GCTGCGCTCA ACCCCGGCGT GAACCTGCCC GGCACCGCGG TAGTTCCGCT 1920

GCACCGCTCC GACGGGTCCG GTGACACCTT CTTGTTCACC CAGTACCTGT CCAAGCAAGA 1980

TCCCGAGGGC TGGGGCAAGT CGCCCGGCTT CGGCACCACC GTCGACTTCC CGGCGGTGCC 2040

GGGTGCGCTG GGTGAGAACG GCAACGGCGG CATGGTGACC GGTTGCGCCG AGACACCGGG 2100

CTGCGTGGCC TATATCGGCA TCAGCTTCCT CGACCAGGCC AGTCAACGGG GACTCGGCGA 2160

GGCCCAACTA GGCAATAGCT CTGGCAATTT CTTGTTGCCC GACGCGCAAA GCATTCAGGC 2220

CGCGGCGGCT GGCTTCGCAT CGAAAACCCC GGCGAACCAG GCGATTTCGA TGATCGACGG 2280

GCCCGCCCCG GACGGCTACC CGATCATCAA CTACGAGTAC GCCATCGTCA ACAACCGGCA 2340

AAAGGACGCC GCCACCGCGC AGACCTTGCA GGCATTTCTG CACTGGGCGA TCACCGACGG 2400 

CAACAAGGCC TCGTTCCTCG ACCAGGTTCA TTTCCAGCCG CTGCCGCCCG CGGTGGTGAA 2460 

GTTGTCTGAC GCGTTGATCG CGACGATTTC CAGCGCTGAG ATGAAGACCG ATGCCGCTAC 2520

CCTCGCGCAG GAGGCAGGTA ATTTCGAGCG GATCTCCGGC GACCTGAAAA CCCAGATCGA 2580

CCAGGTGGAG TCGACGGCAG GTTCGTTGCA GGGCCAGTGG CGCGGCGCGG CGGGGACGGC 2640 

CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA AGCAGCCAAT AAGCAGAAGC AGGAACTCGA 2700 

CGAGATCTCG ACGAATATTC GTCAGGCCGG CGTCCAATAC TCGAGGGCCG ACGAGGAGCA 2760 

GCAGCAGGCG CTGTCCTCGC AAATGGGCTT TGGATTCAGC TTCGCGCTGC CTGCTGGCTG 2820

GGTGGAGTCT GACGCCGCCC ACTTCGACTA CGGTTCAGCA CTCCTCAGCA AAACCACCGG 2880

GGACCCGCCA TTTCCCGGAC AGCCGCCGCC GGTGGCCAAT GACACCCGTA TCGTGCTCGG 2940

CCGGCTAGAC CAAAAGCTTT ACGCCAGCGC CGAAGCCACC GACTCCAAGG CCGCGGCCCG 3000

GTTGGGCTCG GACATGGGTG AGTTCTATAT GCCCTACCCG GGCACGCGGA TCAACCAGGA 3060

AACCGTCTCG CTYGACGCCA ACGGGGTGTC TGGAAGCGCG TCGTATTACG AAGTCAAGTT 3120

CAGCGATCCG AGTAAGCCGA ACGGCCAGAT CTGGACGGGC GTAATCGGCT CGCCCGCGGC 3180 

GAACGCACCG GACGCCGGGC CCCCTCAGCG CTGGTTTGTG GTATGGCTCG GGACCGCCAA 3240 

CAACCCGGTG GACAAGGGCG CGGCCAAGGC GCTGGCCGAA TCGATCCGGC CTTTGGTCGC 3300 

CCCGCCGCCG GCGCCGGCCG GGGAAGTCGC TCCTACCCCG ACGACACCGA CACCGCAGCG 3360 

GACCTTACCG GCCTGAGAAT TCTGCAGATA TCCATCACAC TGGCGGCCGC TCGAGCACCA 3420 

CCACCACCAC CACTGAGATC CGGCTGCTAA CAAAGCCCGA AAGGAAGCTG AGTTGGCTGC 3480 

TGCCACCGCT GAGCAATAAC TAGCATAACC CCTTGGGGCC TCTAAACGGG TCTTGAGGGG 3540 

TTTTTTGCTG AAAGGAGGAA CTATATCCGG AT 3572 

(2) INFORMATION FOR SEQ ID NO:343: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:343: 

Val Gln Phe Gln Ser Gly Gly Asp Asn Ser Pro Ala Val Tyr Xaa Xaa

1 5 10 15

Asp Gly Xaa Arg 

20 

(2) INFORMATION FOR SEQ ID NO: 344: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:344: 

Thr Thr Val Pro Xaa Val Thr Glu Ala Arg 

1 5 10

(2) INFORMATION FOR SEQ ID NO: 345: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:345: 
Thr Thr Pro Ser Xaa Val Ala Phe Ala Arg 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 346: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 346:
Asp Ala Gly Lys Xaa Ala Gly Xaa Asp Val Xaa Arg 



(2) INFORMATION FOR SEQ ID NO: 347: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 347: 

Thr Xaa Glu Glu Xaa Gln Glu Ser Phe Asn Ser Ala Ala Pro Gly Asn

1 5 10 15

Xaa Lys 



(2) INFORMATION FOR SEQ ID NO: 348: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 348:
CTAGTTAGTA CTCAGTCGCA GACCGTG 

(2) INFORMATION FOR SEQ ID NO: 349:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: Other

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 349:

GCAGTGACGA ATTCACTTCG ACTCC 25 
(2) INFORMATION FOR SEQ ID NO: 350: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2412 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 350:

CATATGGGCC ATCATCATCA TCATCACGTG ATCGACATCA TCGGGACCAG CCCCACATCC 60 

TGGGAACAGG CGGCGGCGGA GGCGGTCCAG CGGGCGCGGG ATAGCGTCGA TGACATCCGC 120 

GTCGCTCGGG TCATTGAGCA GGACATGGCC GTGGACAGCG CCGGCAAGAT CACCTACCGC 180 

ATCAAGCTCG AAGTGTCGTT CAAGATGAGG CCGGCGCAAC CGAGGGGCTC GAAACCACCG 240 

AGCGGTTCGC CTGAAACGGG CGCCGGCGCC GGTACTGTCG CGACTACCCC CGCGTCGTCG 300 

CCGGTGACGT TGGCGGAGAC CGGTAGCACG CTGCTCTACC CGCTGTTCAA CCTGTGGGGT 360 

CCGGCCTTTC ACGAGAGGTA TCCGAACGTC ACGATCACCG CTCAGGGCAC CGGTTCTGGT 420 

GCCGGGATCG CGCAGGCCGC CGCCGGGACG GTCAACATTG GGGCCTCCGA CGCCTATCTG 480 

TCGGAAGGTG ATATGGCCGC GCACAAGGGG CTGATGAACA TCGCGCTAGC CATCTCCGCT 540

CAGCAGGTCA ACTACAACCT GCCCGGAGTG AGCGAGCACC TCAAGCTGAA CGGAAAAGTC 600 

CTGGCGGCCA TGTACCAGGG CACCATCAAA ACCTGGGACG ACCCGCAGAT CGCTGCGCTC 660

AACCCCGGCG TGAACCTGCC CGGCACCGCG GTAGTTCCGC TGCACCGCTC CGACGGGTCC 720 

GGTGACACCT TCTTGTTCAC CCAGTACCTG TCCAAGCAAG ATCCCGAGGG CTGGGGCAAG 780 

TCGCCCGGCT TCGGCACCAC CGTCGACTTC CCGGCGGTGC CGGGTGCGCT GGGTGAGAAC 840

GGCAACGGCG GCATGGTGAC CGGTTGCGCC GAGACACCGG GCTGCGTGGC CTATATCGGC 900 

ATCAGCTTCC TCGACCAGGC CAGTCAACGG GGACTCGGCG AGGCCCAACT AGGCAATAGC 960 

TCTGGCAATT TCTTGTTGCC CGACGCGCAA AGCATTCAGG CCGCGGCGGC TGGCTTCGCA 1020 

TCGAAAACCC CGGCGAACCA GGCGATTTCG ATGATCGACG GGCCCGCCCC GGACGGCTAC 1080

CCGATCATCA ACTACGAGTA CGCCATCGTC AACAACCGGC AAAAGGACGC CGCCACCGCG 1140

CAGACCTTGC AGGCATTTCT GCACTGGGCG ATCACCGACG GCAACAAGGC CTCGTTCCTC 1200 

GACCAGGTTC ATTTCCAGCC GCTGCCGCCC GCGGTGGTGA AGTTGTCTGA CGCGTTGATC 1260 

GCGACGATTT CCAGCGCTGA GATGAAGACC GATGCCGCTA CCCTCGCGCA GGAGGCAGGT 1320

AATTTCGAGC GGATCTCCGG CGACCTGAAA ACCCAGATCG ACCAGGTGGA GTCGACGGCA 1380 

GGTTCGTTGC AGGGCCAGTG GCGCGGCGCG GCGGGGACGG CCGCCCAGGC CGCGGTGGTG 1440 

CGCTTCCAAG AAGCAGCCAA TAAGCAGAAG CAGGAACTCG ACGAGATCTC GACGAATATT 1500 

CGTCAGGCCG GCGTCCAATA CTCGAGGGCC GACGAGGAGC AGCAGCAGGC GCTGTCCTCG 1560 

CAAATGGGCT TTGTGCCCAC AACGGCCGCC TCGCCGCCGT CGACCGCTGC AGCGCCACCC 1620 

GCACCGGCGA CACCTGTTGC CCCCCCACCA CCGGCCGCCG CCAACACGCC GAATGCCCAG 1680 

CCGGGCGATC CCAACGCAGC ACCTCCGCCG GCCGACCCGA ACGCACCGCC GCCACCTGTC 1740 

ATTGCCCCAA ACGCACCCCA ACCTGTCCGG ATCGACAACC CGGTTGGAGG ATTCAGCTTC 1800

GCGCTGCCTG CTGGCTGGGT GGAGTCTGAC GCCGCCCACT TCGACTACGG TTCAGCACTC 1860

CTCAGCAAAA CCACCGGGGA CCCGCCATTT CCCGGACAGC CGCCGCCGGT GGCCAATGAC 1920

ACCCGTATCG TGCTCGGCCG GCTAGACCAA AAGCTTTACG CCAGCGCCGA AGCCACCGAC 1980 

TCCAAGGCCG CGGCCCGGTT GGGCTCGGAC ATGGGTGAGT TCTATATGCC CTACCCGGGC 2040

ACCCGGATCA ACCAGGAAAC CGTCTCGCTC GACGCCAACG GGGTGTCTGG AAGCGCGTCG 2100 

TATTACGAAG TCAAGTTCAG CGATCCGAGT AAGCCGAACG GCCAGATCTG GACGGGCGTA 2160

ATCGGCTCGC CCGCGGCGAA CGCACCGGAC GCCGGGCCCC CTCAGCGCTG GTTTGTGGTA 2220

TGGCTCGGGA CCGCCAACAA CCCGGTGGAC AAGGGCGCGG CCAAGGCGCT GGCCGAATCG 2280 

ATCCGGCCTT TGGTCGCCCC GCCGCCGGCG CCGGCACCGG CTCCTGCAGA GCCCGCTCCG 2340 

GCGCCGGCGC CGGCCGGGGA AGTCGCTCCT ACCCCGACGA CACCGACACC GCAGCGGACC 2400 

TTACCGGCCT GA 2412 
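As a rough cross-check between this nucleotide listing and the 802-residue protein of SEQ ID NO: 351, the sketch below translates the first 60-nucleotide line of SEQ ID NO: 350 with the standard genetic code. It assumes, as the sequence itself suggests but the listing does not state, that the reading frame opens at the first ATG (inside the leading CATATG) and closes at the terminal TGA; the helper name translate is illustrative only. Under that assumption the lengths agree: 3 leading nucleotides + 802 codons x 3 + 3 stop-codon nucleotides = 2412.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, start_codon="ATG"):
    """Translate from the first start codon to the first stop codon (one-letter code)."""
    dna = "".join(dna.split()).upper()  # drop the listing's block spacing
    start = dna.find(start_codon)
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


# First line of SEQ ID NO: 350 as printed in the listing.
first_line = "CATATGGGCC ATCATCATCA TCATCACGTG ATCGACATCA TCGGGACCAG CCCCACATCC"
print(translate(first_line))  # Met-Gly followed by six His residues, then Val-Ile-Asp ...

# Length bookkeeping for the full entry (values taken from the listing headers).
print((2412 - 3 - 3) // 3)  # number of codons between the leading CAT and the stop
```

The same function can be applied to the full 2412-nucleotide entry once the block spacing and position counters are stripped.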



(2) INFORMATION FOR SEQ ID NO: 351: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 802 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 351: 



Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr Ser
1 5 10 15

Pro Thr Ser Trp Glu Gln Ala Ala Ala Glu Ala Val Gln Arg Ala Arg
20 25 30

Asp Ser Val Asp Asp Ile Arg Val Ala Arg Val Ile Glu Gln Asp Met
35 40 45

Ala Val Asp Ser Ala Gly Lys Ile Thr Tyr Arg Ile Lys Leu Glu Val
50 55 60

Ser Phe Lys Met Arg Pro Ala Gln Pro Arg Gly Ser Lys Pro Pro Ser
65 70 75 80

Gly Ser Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro
85 90 95

Ala Ser Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr
100 105 110

Pro Leu Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn
115 120 125

Val Thr Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln
130 135 140

Ala Ala Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser
145 150 155 160

Glu Gly Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala
165 170 175

Ile Ser Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His
180 185 190

Leu Lys Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile
195 200 205

Lys Thr Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn
210 215 220

Leu Pro Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly
225 230 235 240

Asp Thr Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly
245 250 255

Trp Gly Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val
260 265 270

Pro Gly Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys
275 280 285

Ala Glu Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp
290 295 300

Gln Ala Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser
305 310 315 320

Gly Asn Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala
325 330 335

Gly Phe Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp
340 345 350

Gly Pro Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile
355 360 365

Val Asn Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala
370 375 380

Phe Leu His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp
385 390 395 400

Gln Val His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp
405 410 415

Ala Leu Ile Ala Thr Ile Ser Ser Ala Glu Met Lys Thr Asp Ala Ala
420 425 430

Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile Ser Gly Asp Leu
435 440 445

Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly
450 455 460

Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg
465 470 475 480

Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser
485 490 495

Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu Glu
500 505 510

Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe Val Pro Thr Thr Ala
515 520 525

Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro
530 535 540

Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro
545 550 555 560

Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro
565 570 575

Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn
580 585 590

Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser
595 600 605

Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr
610 615 620

Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr
625 630 635 640

Arg Ile Val Leu Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu
645 650 655

Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu
660 665 670

Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser
675 680 685

Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys
690 695 700

Phe Ser Asp Pro Ser Lys Pro Asn Gly Gln Ile Trp Thr Gly Val Ile
705 710 715 720

Gly Ser Pro Ala Ala Asn Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp
725 730 735

Phe Val Val Trp Leu Gly Thr Ala Asn Asn Pro Val Asp Lys Gly Ala
740 745 750

Ala Lys Ala Leu Ala Glu Ser Ile Arg Pro Leu Val Ala Pro Pro Pro
755 760 765

Ala Pro Ala Pro Ala Pro Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala
770 775 780

Gly Glu Val Ala Pro Thr Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu
785 790 795 800

Pro Ala



(2) INFORMATION FOR SEQ ID NO: 352:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 352: 
GGATCCAAAC CACCGAGCGG TTCGCCTGAA ACGG 34 
(2) INFORMATION FOR SEQ ID NO: 353:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 353:
CGCTGCGAAT TCACCTCCGG AGGAAATCGT CGCGATC 37 
(2) INFORMATION FOR SEQ ID NO: 354: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1962 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 354:

CATATGGGCC ATCATCATCA TCATCACGGA TCCAAACCAC CGAGCGGTTC GCCTGAAACG 60

GGCGCCGGCG CCGGTACTGT CGCGACTACC CCCGCGTCGT CGCCGGTGAC GTTGGCGGAG 120 

ACCGGTAGCA CGCTGCTCTA CCCGCTGTTC AACCTGTGGG GTCCGGCCTT TCACGAGAGG 180 

TATCCGAACG TCACGATCAC CGCTCAGGGC ACCGGTTCTG GTGCCGGGAT CGCGCAGGCC 240

GCCGCCGGGA CGGTCAACAT TGGGGCCTCC GACGCCTATC TGTCGGAAGG TGATATGGCC 300

GCGCACAAGG GGCTGATGAA CATCGCGCTA GCCATCTCCG CTCAGCAGGT CAACTACAAC 360 

CTGCCCGGAG TGAGCGAGCA CCTCAAGCTG AACGGAAAAG TCCTGGCGGC CATGTACCAG 420 

GGCACCATCA AAACCTGGGA CGACCCGCAG ATCGCTGCGC TCAACCCCGG CGTGAACCTG 480 

CCCGGCACCG CGGTAGTTCC GCTGCACCGC TCCGACGGGT CCGGTGACAC CTTCTTGTTC 540 

ACCCAGTACC TGTCCAAGCA AGATCCCGAG GGCTGGGGCA AGTCGCCCGG CTTCGGCACC 600 

ACCGTCGACT TCCCGGCGGT GCCGGGTGCG CTGGGTGAGA ACGGCAACGG CGGCATGGTG 660 

ACCGGTTGCG CCGAGACACC GGGCTGCGTG GCCTATATCG GCATCAGCTT CCTCGACCAG 720 

GCCAGTCAAC GGGGACTCGG CGAGGCCCAA CTAGGCAATA GCTCTGGCAA TTTCTTGTTG 780 

CCCGACGCGC AAAGCATTCA GGCCGCGGCG GCTGGCTTCG CATCGAAAAC CCCGGCGAAC 840 

CAGGCGATTT CGATGATCGA CGGGCCCGCC CCGGACGGCT ACCCGATCAT CAACTACGAG 900 

TACGCCATCG TCAACAACCG GCAAAAGGAC GCCGCCACCG CGCAGACCTT GCAGGCATTT 960 

CTGCACTGGG CGATCACCGA CGGCAACAAG GCCTCGTTCC TCGACCAGGT TCATTTCCAG 1020

CCGCTGCCGC CCGCGGTGGT GAAGTTGTCT GACGCGTTGA TCGCGACGAT TTCCTCCGGA 1080 

GGTGGCAGTG GGGGAGGCTC AGGTGGAGGT TCTGGCGGGA GCGTGCCCAC AACGGCCGCC 1140 

TCGCCGCCGT CGACCGCTGC AGCGCCACCC GCACCGGCGA CACCTGTTGC CCCCCCACCA 1200 

CCGGCCGCCG CCAACACGCC GAATGCCCAG CCGGGCGATC CCAACGCAGC ACCTCCGCCG 1260 

GCCGACCCGA ACGCACCGCC GCCACCTGTC ATTGCCCCAA ACGCACCCCA ACCTGTCCGG 1320 

ATCGACAACC CGGTTGGAGG ATTCAGCTTC GCGCTGCCTG CTGGCTGGGT GGAGTCTGAC 1380 

GCCGCCCACT TCGACTACGG TTCAGCACTC CTCAGCAAAA CCACCGGGGA CCCGCCATTT 1440 

CCCGGACAGC CGCCGCCGGT GGCCAATGAC ACCCGTATCG TGCTCGGCCG GCTAGACCAA 1500 

AAGCTTTACG CCAGCGCCGA AGCCACCGAC TCCAAGGCCG CGGCCCGGTT GGGCTCGGAC 1560 

ATGGGTGAGT TCTATATGCC CTACCCGGGC ACCCGGATCA ACCAGGAAAC CGTCTCGCTC 1620 

GACGCCAACG GGGTGTCTGG AAGCGCGTCG TATTACGAAG TCAAGTTCAG CGATCCGAGT 1680 

AAGCCGAACG GCCAGATCTG GACGGGCGTA ATCGGCTCGC CCGCGGCGAA CGCACCGGAC 1740 

GCCGGGCCCC CTCAGCGCTG GTTTGTGGTA TGGCTCGGGA CCGCCAACAA CCCGGTGGAC 1800 

AAGGGCGCGG CCAAGGCGCT GGCCGAATCG ATCCGGCCTT TGGTCGCCCC GCCGCCGGCG 1860 

CCGGCACCGG CTCCTGCAGA GCCCGCTCCG GCGCCGGCGC CGGCCGGGGA AGTCGCTCCT 1920 

ACCCCGACGA CACCGACACC GCAGCGGACC TTACCGGCCT GA 1962
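The two oligonucleotides of SEQ ID NOS: 352 and 353 can be compared against the coding sequence above with a short sketch. It assumes, which this listing does not itself state, that SEQ ID NO: 352 reads on the coding strand and SEQ ID NO: 353 on the opposite strand; the function names and the two template fragments (copied from positions 1-70 and 1051-1090 of SEQ ID NO: 354) are illustrative only. Under those assumptions all 34 nucleotides of SEQ ID NO: 352 occur in the coding strand, while the 26 nucleotides at the 3' end of SEQ ID NO: 353 match the opposite strand and the 11 nucleotides at its 5' end do not.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove the listing's block spacing and normalise case."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def longest_3prime_match(oligo, template):
    """Length of the longest 3'-terminal stretch of oligo found in template."""
    oligo, template = clean(oligo), clean(template)
    for n in range(len(oligo), 0, -1):
        if oligo[-n:] in template:
            return n
    return 0


seq_352 = "GGATCCAAAC CACCGAGCGG TTCGCCTGAA ACGG"
seq_353 = "CGCTGCGAAT TCACCTCCGG AGGAAATCGT CGCGATC"

# Fragments of SEQ ID NO: 354 as printed in the listing (coding strand).
fragment_5prime = ("CATATGGGCC ATCATCATCA TCATCACGGA TCCAAACCAC "
                   "CGAGCGGTTC GCCTGAAACG GGCGCCGGCG")
fragment_mid = "GACGCGTTGA TCGCGACGAT TTCCTCCGGA GGTGGCAGTG"

# Same-strand comparison for SEQ ID NO: 352.
print(longest_3prime_match(seq_352, fragment_5prime))
# Opposite-strand comparison for SEQ ID NO: 353.
print(longest_3prime_match(seq_353, reverse_complement(fragment_mid)))
```

The unmatched 5' portion of SEQ ID NO: 353 contains GAATTC, and SEQ ID NO: 352 begins with GGATCC; both hexamers are palindromic, so they read the same on either strand.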

(2) INFORMATION FOR SEQ ID NO: 355: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 355:



Met Gly His His His His His His Gly Ser Lys Pro Pro Ser Gly Ser
1 5 10 15

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
20 25 30

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
35 40 45

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
50 55 60

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
65 70 75 80

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
85 90 95

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
100 105 110

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
115 120 125

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
130 135 140

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
145 150 155 160

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
165 170 175

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
180 185 190

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
195 200 205

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
210 215 220

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
225 230 235 240

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
245 250 255

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
260 265 270

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
275 280 285

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
290 295 300

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
305 310 315 320

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
325 330 335

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
340 345 350

Ile Ala Thr Ile Ser Ser Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly
355 360 365

Gly Ser Gly Gly Ser Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr
370 375 380

Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro
385 390 395 400

Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala
405 410 415

Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro
420 425 430

Asn Ala Pro Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser
435 440 445

Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp
450 455 460

Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro
465 470 475 480

Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg
485 490 495

Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala
500 505 510

Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro
515 520 525

Gly Thr Arg Ile Asn Gln Glu Thr Val Ser Leu Asp Ala Asn Gly Val
530 535 540






Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys
545 550 555 560

Pro Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn
565 570 575

Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu Gly
580 585 590

Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu
595 600 605

Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro
610 615 620

Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr
625 630 635 640

Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala
645 650

CLAIMS

1. A polypeptide comprising an immunogenic portion of a soluble 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu; (SEQ ID No. 120)

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser;
(SEQ ID No. 121)

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
Lys-Glu-Gly-Arg; (SEQ ID No. 122)

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro;
(SEQ ID No. 123)

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val; (SEQ
ID No. 124)

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID No.
125)

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser; (SEQ ID No. 126)

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly;
(SEQ ID No. 127)

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn; (SEQ
ID No. 128) and

(j) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly;
(SEQ ID No. 136) 
wherein Xaa may be any amino acid. 

2. A polypeptide comprising an immunogenic portion of an 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 129) and

(b) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val; (SEQ ID No. 137), wherein Xaa may be any 
amino acid. 

3. A polypeptide comprising an immunogenic portion of a soluble 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence 
encoded by a DNA sequence selected from the group consisting of the sequences recited in 
SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101, the complements of said sequences, and 
DNA sequences that hybridize to a sequence recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52,
99 and 101 or a complement thereof under moderately stringent conditions. 

4. A polypeptide comprising an immunogenic portion of a 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence 
encoded by a DNA sequence selected from the group consisting of the sequences recited in 
SEQ ID Nos.: 26-51, 138, 139, 163-183, 201, 240, 242-247, 253-256, 295-298, 309, 316,
318-320, 322, 324, 328, 329, 333, 335, 337, 339 and 341, the complements of said sequences,
and DNA sequences that hybridize to a sequence recited in SEQ ID Nos.: 26-51, 138, 139,
163-183, 201, 240, 242-247, 253-256, 295-298, 309, 316, 318-320, 322, 324, 328, 329, 333,
335, 337, 339 and 341 or a complement thereof under moderately stringent conditions.

5. A DNA molecule comprising a nucleotide sequence encoding a 
polypeptide according to any one of claims 1-4. 

6. An expression vector comprising a DNA molecule according to claim 5.

7. A host cell transformed with an expression vector according to claim 6.

8. The host cell of claim 7 wherein the host cell is selected from the group 
consisting of E. coli, yeast and mammalian cells. 

9. A pharmaceutical composition comprising one or more polypeptides 
according to any one of claims 1-4 and a physiologically acceptable carrier. 

10. A pharmaceutical composition comprising one or more DNA 
molecules according to claim 5 and a physiologically acceptable carrier. 

11. A pharmaceutical composition comprising one or more DNA 
sequences recited in SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200, 203, 
215-225, 237, 239, 261-276, 292, 293, 303-308, 310-315, 317, 321, 323, 325-327, 330-332,
334, 336, 338, 340 and 342-347; and a physiologically acceptable carrier.

12. A vaccine comprising one or more polypeptides according to any one 
of claims 1-4 and a non-specific immune response enhancer. 

13. A vaccine comprising: 

a polypeptide having an N-terminal sequence selected from the group 
consisting of sequences recited in SEQ ID NO: 134 and 135; and 
a non-specific immune response enhancer. 

14. A vaccine comprising: 

one or more polypeptides encoded by a DNA sequence selected from the 
group consisting of SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200, 203,
215-225, 237, 239, 261-276, 292, 293, 303-308, 310-315, 317, 321, 323, 325-327, 330-332, 334,
336, 338, 340 and 342-347, the complements of said sequences, and DNA sequences that
hybridize to a sequence recited in SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199,
200, 203, 215-225, 237, 239, 261-276, 292, 293, 303-308, 310-315, 317, 321, 323, 325-327,
330-332, 334, 336, 338, 340 and 342-347; and
a non-specific immune response enhancer.

15. The vaccine of claims 12-14 wherein the non-specific immune 
response enhancer is an adjuvant. 

16. A vaccine comprising one or more DNA molecules according to claim 
5 and a non-specific immune response enhancer. 

17. A vaccine comprising one or more DNA sequences recited in SEQ ID 
Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200, 203, 215-225, 237, 239, 261-276,
292, 293, 303-308, 310-315, 317, 321, 323, 325-327, 330-332, 334, 336, 338, 340 and 342- 
347; and a non-specific immune response enhancer. 

18. The vaccine of claims 16 or 17 wherein the non-specific immune 
response enhancer is an adjuvant. 

19. A method for inducing protective immunity in a patient, comprising 
administering to a patient a pharmaceutical composition according to any one of claims 9-11. 

20. A method for inducing protective immunity in a patient comprising 
administering to a patient a vaccine according to any one of claims 12-18. 

21. A fusion protein comprising two or more polypeptides according to 
any one of claims 1-4. 

22. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and ESAT-6. 



23. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and the M. tuberculosis antigen 38 kD (SEQ ID NO: 155).

24. A pharmaceutical composition comprising a fusion protein according
to any one of claims 21-23 and a physiologically acceptable carrier. 

25. A vaccine comprising a fusion protein according to any one of claims 
21-23 and a non-specific immune response enhancer. 

26. The vaccine of claim 25 wherein the non-specific immune response 
enhancer is an adjuvant. 

27. A method for inducing protective immunity in a patient comprising 
administering to a patient a pharmaceutical composition according to claim 24. 

28. A method for inducing protective immunity in a patient, comprising 
administering to a patient a vaccine according to claims 25 or 26. 

29. A method for detecting tuberculosis in a patient, comprising: 

(a) contacting dermal cells of a patient with one or more polypeptides 
according to any one of claims 1-4; and 

(b) detecting an immune response on the patient's skin and therefrom 
detecting tuberculosis in the patient. 

30. A method for detecting tuberculosis in a patient, comprising: 

(a) contacting dermal cells of a patient with a polypeptide having an N- 
terminal sequence selected from the group consisting of sequences recited in SEQ ID NO: 
134 and 135; and 

(b) detecting an immune response on the patient's skin and therefrom 
detecting tuberculosis in the patient. 

31. A method for detecting tuberculosis in a patient, comprising: 

(a) contacting dermal cells of a patient with one or more polypeptides 
encoded by a DNA sequence selected from the group consisting of SEQ ID Nos.: 3, 11, 12,
140, 141, 156-160, 189-193, 199, 200, 203, 215-225, 237, 239, 261-276, 292, 293, 303-308,
310-315, 317, 321, 323, 325-327, 330-332, 334, 336, 338, 340 and 342-347, the complements
of said sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID Nos.:
3, 11, 12, 140, 141, 156-160, 189-193, 199, 200, 203, 215-225, 237, 239, 261-276, 292, 293,
303-308, 310-315, 317, 321, 323, 325-327, 330-332, 334, 336, 338, 340 and 342-347; and 

(b) detecting an immune response on the patient's skin and therefrom 
detecting tuberculosis in the patient. 



32. The method of any one of claims 29-31 wherein the immune response
is induration.



33. A diagnostic kit comprising: 

(a) a polypeptide according to any one of claims 1-4; and 

(b) apparatus sufficient to contact said polypeptide with the dermal cells of
a patient.



34. A diagnostic kit comprising: 

(a) a polypeptide having an N-terminal sequence selected from the group 
consisting of sequences recited in SEQ ID NO: 134 and 135; and 

(b) apparatus sufficient to contact said polypeptide with the dermal cells of
a patient.

35. A diagnostic kit comprising: 

(a) a polypeptide encoded by a DNA sequence selected from the group 
consisting of SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199, 200, 203, 215-225,
237, 239, 261-276, 292, 293, 303-308, 310-315, 317, 321, 323, 325-327, 330-332, 334, 336,
338, 340 and 342-347, the complements of said sequences, and DNA sequences that
hybridize to a sequence recited in SEQ ID Nos.: 3, 11, 12, 140, 141, 156-160, 189-193, 199,
200, 203, 215-225, 237, 239, 261-276, 292, 293, 303-308, 310-315, 317, 321, 323, 325-327,
330-332, 334, 336, 338, 340 and 342-347; and

(b) apparatus sufficient to contact said polypeptide with the dermal cells of
a patient.

36. A diagnostic kit comprising:

(a) a fusion protein according to any one of claims 21-23; and 

(b) apparatus sufficient to contact said fusion protein with the dermal cells of a
patient. 

37. A fusion protein according to claim 23 comprising an amino acid 
sequence selected from the group consisting of sequences recited in SEQ ID NO: 153, 209,
351 and 355. 